SECURITY CLASSIFICATION OF THIS PAGE (When Data Entered)

REPORT  DOCUMENTATION  PAGE 


Technical  Report  No.  1 


READ  INSTRUCTIONS 
BEFORE  COMPLETING  FORM 


3. RECIPIENT'S CATALOG NUMBER


An Information Processing Approach to
Performance Assessment:

I. Experimental Investigation of an Information
Processing Performance Battery


5.  TYPE  OF  R 


Technical 


AIR-58500-TR


A.  M.  Rose 
K.  Fernandes 


9.  PERFORMING  ORGANIZATION  NAME  AND  ADDRESS 

AMERICAN  INSTITUTES  FOR  RESEARCH  ~ 
1055  Thomas  Jefferson  Street,  N.W. 
Washington,  D.C.  20007 


1 1 . CONTROLLING  OFFICE  NAME  AND  ADDRESS 

Personnel  and  Training  Research  Programs, 
Office  of  Naval  Research  - Code  458 


N00014-76-C-0871


15. SECURITY CLASS. (of this report)

UNCLASSIFIED


15a. DECLASSIFICATION/DOWNGRADING
SCHEDULE 


Approved  for  public  release;  distribution  unlimited. 


17. DISTRIBUTION STATEMENT (of the abstract entered in Block 20, if different from Report)


19. KEY WORDS (Continue on reverse side if necessary and identify by block number)

Personnel  Technology 
Individual  Differences 
Personnel  Assessment 
Information  Processing 
Cognitive Processing


20. ABSTRACT (Continue on reverse side if necessary and identify by block number)

This report describes the first study in a program of research dealing with
the development and validation of a comprehensive standardized test battery that
can be used as an assessment device for the evaluation of performance in a wide
variety of situations. The standardized battery is being designed to possess
high reliability and predictive validity for a wide variety of criterion tasks.




20. Abstract (Cont'd.)


Equally important, the battery is being designed to include tests that possess
construct validity: there will be a firm theoretical and empirical base for
inferring the information processing structures and functions that the tests
purport to measure. It is expected that such a battery will permit improved
personnel management decisions to be made for a wider variety of Navy-relevant
jobs than is currently possible using existing techniques.


The major purpose of the present experimental study was to determine
properties of the tasks selected for inclusion in the test battery. The primary
questions addressed during this phase concerned the replicability of previous
findings and the adequacy of the tests to provide measures of individual
differences. In addition, information concerning the construct validity of the
tasks and population norms for the resultant measures could be investigated.
With the relatively large data base employed (54 subjects), additional data
concerning the ability of the set of measures to separate individuals within
the population could be examined.

The  tests  investigated  included  the  following: 

Letter  classification  (Posner  and  Mitchell) 

Lexical  decision  making  (Meyer) 

Graphemic and phonemic analyses (Baron)

Short term memory scanning (Sternberg)

Memory  scanning  for  words  and  categories  (Juola) 

Linguistic  verification  (Clark  and  Chase) 

Recognition  memory  (Shepard  and  Teghtsoonian) 

Semantic  memory  retrieval  (Collins  and  Quillian) 

Several questions were addressed in this phase of the research. First,
replicability of previous experimental work with similar paradigms was
investigated. In general, the results were quite compatible with previous
findings for all eight tasks. The second area addressed concerned the
establishment of the reliability, validity, and independence of the tasks being
studied. In general, the reliabilities for most measures were quite high
(r ≥ .50). The measures were also analyzed to determine practice effects and
the character of the response distributions in the population for each of the
measures.

In order to address validity-type issues, inter- and intratask correlations
were calculated. In general, these analyses support the construct validity of
the tasks and measures.




AIR-58500-TR 



An  Information  Processing  Approach 
to  Performance  Assessment: 

I.  Experimental  Investigation  of  an 
Information  Processing  Performance  Battery 

ANDREW  M.  ROSE 
KATHLEEN  FERNANDES 



TECHNICAL  REPORT 
November  1977 



Personnel  and  Training  Research  Programs,  Psychological  Sciences  Division, 
Office  of  Naval  Research,  Arlington,  Virginia 

Contract  No.  N00014-76-C-0871 


AMERICAN INSTITUTES FOR RESEARCH / 1055 Thomas Jefferson Street, NW, Washington, DC 20007


Approved  for  public  release;  distribution  unlimited. 

Reproduction in whole or in part is permitted for any purpose of the United States government.


ACKNOWLEDGMENTS 




The authors wish to express their sincere appreciation to the Army
Research Institute for their kind permission to allow us use of their
computer laboratory facilities. In particular, the assistance of the
following ARI staff was vital to the conduct of our experiment: Dr.
Abraham H. Birnbaum, Assistant to the Director of Organizations and
Systems Research Laboratory (OSRL); Mr. John R. Schjelderup, Mr. Joseph
P. Severo, Mr. Arthur Lynch, and Mr. Charles Marshall, all of the
Research Support Group under OSRL; and Mr. Donald E. Andre, Computer
Operations Supervisor of the data processing center.

Also, we would like to acknowledge the assistance of Dr. David E.
Meyer, Dr. James F. Juola, Dr. Jonathan Baron, and Dr. Allan M. Collins,
who provided us with stimulus materials, instructions, and other
materials useful in the conduct of our study.

Finally,  we  would  like  to  acknowledge  the  contribution  of  Dr.  Paul 
W.  Fingerman  of  AIR,  who  was  the  principal  author  and  constructor  of 
the  computer  programs  used  in  the  real-time  conduct  of  the  experiment. 


TABLE  OF  CONTENTS 




Section                                     Page

ACKNOWLEDGMENTS                             ii
TABLE OF CONTENTS                           iii
LIST OF TABLES                              iv
LIST OF FIGURES                             v

INTRODUCTION                                1
    General Task Overview                   5
    Task Descriptions                       7
    Summary                                 17

METHOD                                      19
    Testing Facilities                      19
    Procedure                               19a
    Subjects                                19a
    Posner Task                             19b
    Meyer Task                              20
    Baron Task                              22
    Sternberg Task                          23
    Juola Task                              24
    Clark and Chase Task                    26
    Collins and Quillian Task               27
    Shepard and Teghtsoonian Task           29

RESULTS AND DISCUSSION                      31
    Replications of Group Effects           31
    Individual Measures                     49
    Construct Validity                      54

REFERENCES                                  76

APPENDIX A                                  78


LIST  OF  TABLES 


Table No.                                                        Page No.

1     Operational Overview of Tasks                                 18
2     Parameter Estimates for Meyer Task                            34
3     Summary of Selected Measures for Baron Task in Mean
      RT/item (msec)                                                36
4     Slopes and Intercepts (msec) of the Best-Fitting Linear
      Functions Relating Mean RTs to Memory Set Size in
      Juola Tasks                                                   41
5     Breakdown of Latencies for Eight Types of Sentences
      from Clark and Chase Task                                     43
6     Selected Shepard and Teghtsoonian Task Parameters             48
7     Operations for Each Task Measure                              50
8     Test-Retest Reliabilities                                     52
9     Measures Showing Significant and Nonsignificant
      Practice Effects (p ≤ .05)                                    53
10    Descriptive Measures and Frequency Polygons                   55
11-A  Inter- and Intratask Correlations for Day 1                   70
11-B  Inter- and Intratask Correlations for Day 2                   71


LIST OF FIGURES


Figure No.                                                       Page No.

Tree diagram for letter classification task
Day 1 and Day 2 mean RTs on positive and negative trials
for Sternberg task
Best-fit regression lines for Juola and Atkinson task
Mean RTs (msec) for Clark and Chase Task                            44
Mean RTs (msec) and best-fit regression lines for Collins
and Quillian task                                                   45
Lag function for Shepard and Teghtsoonian task                      47


INTRODUCTION 

In order to improve the accuracy of personnel selection, classification,
guidance, and the design of training programs, procedures for making
meaningful evaluations or differentiations among people with respect to
relevant characteristics must be established. For example, different
occupations and professions place differential demands upon human cognitive
abilities. It follows, therefore, that maximization of the fit between
people and jobs necessitates both the specification of essential
requirements for each type of work and the assessment of personnel with
respect to the characteristics needed on the several jobs. As a consequence,
the central question of manpower utilization revolves around performance
assessment and individual differences: What are the performance requirements
for a particular task or job, and how can the person be found or trained who
best matches those requirements?

Almost since its inception approximately one hundred years ago, one of
the major concerns of scientific psychology has been the assessment and
evaluation of individuals along a wide variety of dimensions. In fact,
despite their controversial status in recent times, it can be argued that
personnel and evaluation tests are the most widespread contributions of
the science of psychology, and possibly the most important practical
utilizations of empirically derived information concerning human performance.

However, with the growing maturity of empirical psychology, personnel
tests and other similar tests have been the source of a certain amount of
dissatisfaction, particularly among experimental psychologists. Much of
this dissatisfaction can be directly attributed to the unanalytical (with
respect to underlying and more basic kinds of performance) nature of most
personnel tests. That is, from the perspective of many, most notably those
subscribing to an information-processing theoretical framework, the types
of tasks represented in global tests such as those purporting to measure
"job performance" too frequently are either extremely difficult to
categorize as to precisely what is being measured, or else involve a not
easily interpretable mixture of two or more identifiable processes or stages.




An initial attempt to develop a more analytical and incisive battery
of information-processing performance tests has been reported by Rose (1974).
Rose compiled a series of tests designed to be useful in assessing
individual differences in a wide variety of information-processing skills and
capacities. Some of the criteria implicit in the initial selection of tests
for this Information Processing Performance Battery (IPPB) were:


(1)  the tests should measure specific processes;

(2)  the tests should be sensitive and yet short and
     easy to administer;

(3)  the tests should yield reliable measures of
     performance;

(4)  the tests should be statistically independent;

(5)  the tests should measure basic abilities and
     hence must be relatively abstract to minimize
     effects of prior experience; and

(6)  taken as a whole, the tests in the battery should
     not be "criterion-based", i.e., the tests should
     not be selected with the aim of predicting one
     particular type of job or activity.


After carefully selecting a series of nine tests that satisfied these
criteria, Rose administered the IPPB for three sessions each to 100
college-age subjects -- 50 males and 50 females. Extensive correlational
analyses were conducted on the data resulting from these administrations to
determine the degree of relationship among the various tests and the
reliabilities of each test.

This work has been extended in a current ONR-sponsored research
program being conducted at the American Institutes for Research. The basic
objective of this work is to further develop and validate the IPPB so that
eventually it can be used as an assessment device for the evaluation of
performance in a wide variety of situations. However, at the initiation of
the present project, the IPPB tapped only a limited number of
information-processing functions and structures. In order to increase the
scope of the functions represented, it was considered necessary to refine and
expand the existing tests in the battery. The general strategy employed in
the development of the IPPB is described in the following paragraphs.

The relevant literature of the past several years was carefully and
extensively reviewed in order to locate major omissions as a basis for
major revision of the IPPB. Several methods were used. First, "standard"
computer-based search and retrieval systems such as PASAR and MEDLARS were
employed to identify relevant references. Second, a number of recently
published texts on information processing were reviewed, as were books of
readings on memory and cognition. Given the growth of literature relating
to information processing during the past several years, a review of
selected secondary sources was considered to be an economical way of
obtaining an overview of the research that had been accomplished since the
IPPB was developed. A third procedure was to collect references dealing with
particular information-processing constructs, processes, or tasks that were
not represented previously. Toward this end a narrow-focus, topic-specific
literature search was initiated.

In using the three literature identification procedures, special
consideration was given to major areas of information processing in which a
large body of research has accumulated since the IPPB was developed. These
areas included memory, psycholinguistics, and visual information processing.
Within each of these areas, relevant paradigms and constructs were further
evaluated with several criteria in mind:

1.  The information-processing construct or concept had to have a
history of empirical and/or theoretical support. The interest here was in
constructs that had been developed over a period of time and in research
paradigms that had been replicated under a variety of conditions. This
criterion was relaxed only in instances where a paradigm was considered to be
a "classic" measure of a particular construct but where no evidence of
replication could be found in the literature.

2.  There had to be an adequate theoretical rationale for the paradigm
actually measuring the particular information-processing construct that it
was intended to measure. The focus was on construct validity rather than
theoretical sophistication. Studies concerned primarily with the
development of mathematical models for certain operations, with the task
itself of only ancillary relevance, were excluded from further consideration.

3.  The experimental task itself had to be one that was adaptable to a
paper-and-pencil format, to a small digital computer, or to some other form
that could be easily administered in a group setting.

4.  Enough performance data had to be available so that preliminary
estimates could be made regarding the extent of individual variation
expected for the task.

The result of the screening activity was a set of 15 tasks that seemed
to be prime candidates for inclusion in the IPPB. These tasks were adapted
or modified into practical formats, and the methodological refinements were
evaluated in a series of in-house, informal pilot studies to determine the
feasibility of alternative adaptations of tasks, instructions, stimuli, and
timing. At the completion of these studies, all of the tasks had been
evaluated to determine their empirical and theoretical support, logical
feasibility, reliability, and, to a limited degree, their construct validity.
As a result of this evaluation, eight tasks were retained and considered
worthy of more extensive experimental investigation.

The next phase of the present project was to determine the properties
of the tasks when they were assembled into a research battery. The primary
questions addressed during this phase concerned the replicability of
previous findings and the adequacy of the tasks to provide measures of
individual differences. In addition, information concerning the construct
validity of the tasks and sample norms for the resultant measures could be
investigated. With the relatively large data base employed, additional issues
concerning the ability of the set of measures to separate individuals within
the population could be examined.

The next sections provide a detailed description of the conduct of the
experiment designed to fulfill those purposes. First, the eight tasks
included in the experiment are described. This presentation is followed by
a description of task measures, experimental method, and results.


General  Task  Overview 


During the course of task selection for the first phases of this
project, we have attempted to remain unbiased toward any general or specific
information-processing model or theory. It has not been our goal to develop
a general model of human information processing, nor has our task selection
been organized around or directed towards a new "structure of intellect."
In this regard, we echo the opinions expressed by Melton (1967, p. 241):

     We have at this time no general theory of human learning
     and performance. Therefore, we have no necessary and
     sufficient list of process constructs or variables that
     can serve as the foci of individual-differences research
     .... There is no magic whereby the processes that should
     be examined with respect to individual differences can be
     identified. The process concepts to be examined will
     depend on the level of analysis that our
     theoretical-experimental approach has achieved and on the
     level of analysis and range of task variables that the
     theoretical model attempts to encompass. (p. 241 ff)




Nevertheless, as a result of our task selection criteria and pilot
task-selection efforts, a post-hoc organizational structure has manifested
itself. This structure is conceptually useful in delimiting the scope of
tasks included in the IPPB as well as in providing a basis upon which
construct validity issues can be discussed.

Carroll (1974), in an initial attempt to characterize a set of
factor-analytically-derived abilities in terms of cognitive processes,
developed an organizational system of information-processing "operations"
that is conceptually similar to the simple structure presented here. He
defines operations as "control processes that are explicitly specified, or
implied, in the task instructions . . . and that must be performed if the
task is to be successfully completed." These operations are of three types:
attentional, memorial, and executive. Of particular interest are the latter
two, which he further subdivided as follows: There are three kinds of
memorial operations: storing, searching, and retrieving. Executive operations
are exemplified by such things as: simple judgments of stimulus attributes
such as to reveal identity, similarity, or comparison between two stimuli;
manipulations of memorial contents, such as "mentally rotating" a
visuospatial configuration; and information transformations which produce
"new" elements from combinations, reductions, etc., of old elements. The
second dimension of this organization is what he calls "modality or
contents" -- basically, the form in which the stimuli are processed by the
operations. While Carroll's organization was developed in a different context
than the present project, it is a good illustration of the level of analysis
that we will continue to use. In fact, we will retain several of the above
distinctions.

In general, all tasks included in the present study can be described as
a series of operations, where an operation is defined as above. Each task
can be specified by some combination of eight such operations. These
operations are described below.

Encoding: the operation by which information is input into the system,
and including the initial set of processes that converts the physical stimulus
to a form which is "appropriate" for the task. Different task demands may
require different levels of analysis of the stimulus. Posner (1969) has called
this dimension "abstraction" -- the process by which different types of
information about the stimulus are extracted; in other words, the level of
stimulus analysis demanded by the task. For example, a visual search task
might require only that the subject extract physical or structural
information about the stimulus, a memory search task might require the
extraction of name information, and a semantic search task might necessitate
semantic or "meaning" information.

Constructing: the operation by which new information structures are
generated from information already in the system. This is what Neisser (1967)
and others have called "synthesis"; in the present context, we will limit the
use to situations where additional features of the stimuli must be abstracted,
beyond those initially encoded.

Transforming: the operation by which a given information structure is
converted into an equivalent structure necessary for task performance. In
contrast to constructing, transformations do not involve any new information
abstraction; rather, this operation requires the application of some stored
rules to the information structure already present.

Storing: the operation by which new information is incorporated into
existing information structures, while its entire content is retained.

Retrieving: the operation by which previously stored information is
made available to the processing system.

Searching: the operation by which an information structure is examined
for the presence or absence of one or more properties. The information
structure examined may be one already in the processing system or one external
to it (e.g., a visual array).

Comparing: the operation by which two information structures (again,
either internal or external to the processing system) are judged to be the
same or different. The information structures need not both be physical
entities (as in the comparison of two objects); likewise, a physical entity
can be compared to a stored representation or description in order to
determine identity.

Responding: the operation by which the appropriate (motor) action is
selected and executed. In many information-processing investigations, the
response operation is itself the object of study. Various microprocesses
have been uncovered; however, the current study was designed to minimize
performance variability due to differential response demands of the tasks.

Each of the eight tasks included in the present study will be described
first, in terminology employed by the particular investigators, and second, as
a function of some of the operations just elucidated. These latter
descriptions do not, of course, represent the original authors' conceptions of
the paradigms.

Task Descriptions*

* For ease of discussion in this section and in following sections, a
shortened label for each of these tasks will be used, namely the principal
author's name. Thus, the letter classification task will be addressed as the
"Posner" task, etc.

Letter classification (Posner task). The process of matching or
recognition at various levels of stimulus complexity is basic to most
cognitive tasks. Posner and Mitchell (1967) developed an experimental paradigm
which "provides an opportunity to observe processing at different levels
within the experiment." In their task, the subject was shown pairs of letters
and had to decide whether the letters were the same or different. The
independent variable was the instruction upon which the subject was told to
make the classification. The instructions used to define "same" were:

1.  Physical  identity  (e.g.,  the  pair  AA  is  to  be  classified 
as  "same"  while  AB  is  "different");  or 

2.  Name identity (e.g., Aa is "same", Ab is "different"); or

3.  Category  or  rule  identity  (e.g.,  if  the  rule  is  one  of 
letter  category,  a stimulus  pair  is  "same"  if  both  members 
are  vowels  or  if  both  members  are  consonants,  such  as  AE 
or  BD). 
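
The three instruction conditions above can be illustrated with a short
Python sketch. This is not the program used in the study; the function name
is_same, the rule labels, and the treatment of every letter outside A, E, I,
O, U as a consonant are assumptions introduced here for illustration only.

```python
# Hypothetical sketch of the three "same" definitions in the letter
# classification task. Names and rule labels are illustrative assumptions.

VOWELS = set("AEIOU")  # assumption: all other letters count as consonants


def is_same(pair: str, rule: str) -> bool:
    """Return True if a two-letter pair is "same" under the given rule."""
    first, second = pair
    if rule == "physical":
        # Physical identity: the two characters are the identical pattern.
        return first == second
    if rule == "name":
        # Name identity: same letter name regardless of case.
        return first.upper() == second.upper()
    if rule == "category":
        # Category identity: both vowels or both consonants.
        return (first.upper() in VOWELS) == (second.upper() in VOWELS)
    raise ValueError(f"unknown rule: {rule}")


if __name__ == "__main__":
    for pair, rule in [("AA", "physical"), ("Aa", "name"), ("BD", "category")]:
        print(pair, rule, is_same(pair, rule))
```

Note that a pair such as Aa is "different" under the physical rule but "same"
under the name rule, which is what allows the paradigm to separate the levels
of processing.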

The typical findings were that the classification reaction times (RTs)
increased as the instructions varied in the above order. The ordinal
relationships among these processing "nodes" (and the time differences
between them) were quite reliable and have been demonstrated to generalize
to other stimuli (e.g., numbers, Gibson figures). It also has been shown
that these types of classifications are serial (i.e., subjects derive the
name of the letters before proceeding to analyze whether they are both vowels
or both consonants).

The procedure in the current project used the three Posner and Mitchell
experimental conditions. The category identity condition was modified
slightly in that the vowel category and the consonant category were tested
conjointly. In the Posner and Mitchell study, the two categories were tested
in separate blocks of trials.

In terms of our operations, the three conditions all involve an initial
encoding of letters. In the physical match case, the subject then compares
the representations of the letter patterns, and finally selects and executes
the appropriate response. The name case requires an additional operation of
retrieval of "name" information from long-term memory (LTM). Subjects then
compare the letter names. In the rule condition, experimental evidence
indicates that subjects retrieve name information prior to retrieving
categorical information (i.e., subjects categorize names of letters as vowels
or consonants rather than the physical patterns). They then compare the
representations of the letter categories.
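
The operation sequences just described can be written out explicitly. The
following Python sketch is only a bookkeeping device under stated
assumptions: the names Operation, POSNER_CONDITIONS, and added_operations are
hypothetical, and the sequences transcribe the verbal description above
rather than any model proposed by Posner and Mitchell.

```python
# Hypothetical sketch: the eight operations of the General Task Overview and
# the three letter classification conditions expressed as operation sequences.
from enum import Enum


class Operation(Enum):
    ENCODING = "encoding"
    CONSTRUCTING = "constructing"
    TRANSFORMING = "transforming"
    STORING = "storing"
    RETRIEVING = "retrieving"
    SEARCHING = "searching"
    COMPARING = "comparing"
    RESPONDING = "responding"


O = Operation
POSNER_CONDITIONS = {
    # encode letters, compare patterns, respond
    "physical": [O.ENCODING, O.COMPARING, O.RESPONDING],
    # one added retrieval: letter names from LTM
    "name": [O.ENCODING, O.RETRIEVING, O.COMPARING, O.RESPONDING],
    # name retrieval, then category retrieval, then comparison
    "rule": [O.ENCODING, O.RETRIEVING, O.RETRIEVING, O.COMPARING,
             O.RESPONDING],
}


def added_operations(simpler: str, harder: str) -> int:
    """Number of operations the harder condition adds to the simpler one."""
    return len(POSNER_CONDITIONS[harder]) - len(POSNER_CONDITIONS[simpler])


if __name__ == "__main__":
    print(added_operations("physical", "name"))
    print(added_operations("name", "rule"))
```

Under this description each step from physical to name to rule identity adds
one retrieval operation, which is consistent with the ordering of RTs
reported for the three conditions.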

Lexical decision making (Meyer task). Rubenstein, Garfield, and
Millikan (1970) developed a procedure designed to investigate the processes by
which humans can recognize written words. On each trial in their paradigm,
a string of letters was presented and the subject had to judge whether it
was an English word or nonword. Performance on this lexical decision task
depended on operations that mediated the recognition of printed words in
various contexts -- that is, graphemic and/or phonemic encoding, followed by
accessing of lexical memory. Various investigators have argued that printed
words are recognized directly from visual representations (graphemes), while
others have claimed that recognition is mediated by a phonological (phonemic)
representation.

The Rubenstein et al. procedure has been modified by Meyer (e.g., Meyer,
Schvaneveldt, and Ruddy, 1974) in order to separate the effects of graphemic
and phonemic factors on recognition. As in the Rubenstein et al. experiments,
subjects were presented with two strings of letters, displayed successively,
on each trial. Reaction time (RT) was measured for each string separately.
The critical variables were the graphemic and phonemic relations within the
pairs of words. For example, the words could be both graphemically and
phonemically similar (e.g., BRIBE - TRIBE), graphemically similar but
phonemically dissimilar because they do not rhyme (e.g., COUCH - TOUCH), and
so on. Meyer et al. formulated and tested various hypotheses concerning the
relative speed of recognition for word pairs; for example, it was found that
graphemic similarity alone inhibited performance (e.g., in the pair
COUCH-TOUCH, RT to TOUCH was slower than predicted from baseline control
conditions). In contrast, phonemic as well as graphemic similarity facilitated
recognition (e.g., in the pair BRIBE-TRIBE, RT to TRIBE was faster than to the
second word of graphemically and phonemically dissimilar word pairs).




The Meyer et al. paradigm was modified in the present study to include
a category of phonemically similar but graphemically dissimilar word
pairs (e.g., LAUGH-HALF).

In terms of operations, both the "word" and "nonword" stimulus
presentations require the subject to encode a letter string; the "mediation"
hypothesis can be formulated as the (optional) construction of a phonemic
or graphemic representation. Presumably, this paradigm will identify those
subjects who have a propensity for one construction or the other.
Following this construction, both conditions require the search of LTM for a
match. When a match is found (in the word case), subjects select and
execute the appropriate response. If a match is not found (in the nonword
case), it is hypothesized that subjects conduct a further search -- this
time, of the lexical rules in LTM in order to decide whether or not a
letter string is an acceptable construction. Following this search, subjects
select and execute the appropriate response.

Graphemic and phonemic analysis (Baron task). Baron (1973; Baron &
McKillop, 1975) has developed a procedure for the study of individual
differences in the speed of phonemic (acoustical) and graphemic (visual)
analysis of printed information (e.g., sentences or phrases). He argued that
lexical memory can be accessed through both visual and phonological
representations of a printed word; also, he argued that a visual analysis is
the faster of the two for normal readers. The basic paradigm used in his
studies was to "force" subjects to analyze phrases visually and
phonologically. More specifically, he had subjects decide whether various
printed phrases made sense or were nonsense. Three conditions were required.
In the first condition two kinds of phrases were used: sense (S) phrases, and
those which sounded sensible because of a homophone (e.g., IT'S KNOT SO) but
looked like nonsense (called H phrases). In this first condition (SH),
subjects were instructed to classify a phrase as making sense or nonsense on
the basis of its appearance (so that H phrases were judged as nonsense). The
second condition used H phrases and true nonsense (N) phrases (e.g., NEW I
CAN'T). In this second condition (HN), subjects were instructed to classify
the phrases on the basis of how they sounded, so that H phrases were judged
as making sense. The third condition used S and N phrases. In this third
condition (SN) subjects were free to choose whatever basis they preferred for
making S and N judgments.

The  basic  analysis  was  to  determine  which  of  the  first  two  conditions  better 
predicted  the  third  condition.  For  example,  if  a particular  subject  was  a 
"visual"  encoder,  he  should  have  had  "problems"  with  the  HN  condition  and 
his  SH  performance  should  have  been  a good  predictor  of  SN  performance. 

The results (as reported in Baron and McKillop, 1975) indicated the
existence of reliable and predictable individual differences: some subjects
were "visual", others "phonemic" encoders.

The procedure in the current study used Baron's three conditions.
However, we obtained RTs on a trial-by-trial basis rather than after a
trial block.

In terms of operations, all three conditions (SH, HN, and SN) require
the subjects to encode semantic phrases. Following this encoding, the
different conditions force subjects to construct either visual or acoustic
representations of each phrase: the SH condition needs a visual
representation, the HN condition needs an acoustic representation, and the SN
condition requires either an acoustic or a visual representation. Following
this construction, subjects must search LTM for what we will call "phrase
rules" -- that set of information or rules that enables them to decide
whether or not a phrase meets acceptable language structure rules. In all
conditions, subjects then select and execute the appropriate response.

Short-term memory scanning (Sternberg task). Sternberg (1967, 1969)
developed an experimental paradigm to "study the ways in which information
is retrieved from memory when learning and retention are essentially perfect"
(Sternberg, 1969, p. 423). The general procedure was to present a list of
items for memorization that was short enough to be within the immediate memory
span (typically, this "memory set" contained 1-4 items). Next, the subject
was asked a question about the memorized list (again, typically, the question
concerned the presence or absence of a stimulus from the memorized set), and
his delay in responding to the question was measured. The particular
manifestation of this general procedure used in the current work was the
"item-recognition task." The stimulus ensemble consisted of the digits 1
through 9. On each trial, a set of digits was selected arbitrarily and was
defined as the positive or memory set. After a short pause, a test stimulus (a
single digit) was presented. The subject had to decide whether the test
digit was a member of the positive set. Performance was measured in terms
of RT from test-stimulus onset to response.

The typical findings were that the functions relating RT to memory set
size are approximately linear, with roughly equal slopes for positive
and negative responses. This outcome has been observed in many different
situations, including differences in stimulus ensemble, subject group
differences, and memory set sizes. The paradigm also resulted in reliable
individual differences with respect to the slope and intercept parameters of
the RT by memory-set-size function. The procedure in the current study was
essentially a replication of Sternberg's "varied set" procedure, wherein
the memory set was changed from trial to trial.
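The slope and intercept parameters mentioned above can be illustrated with a
small sketch. It fits a straight line to mean RT as a function of memory set
size by ordinary least squares. The function name fit_line and the RT values
are hypothetical; they are not data or scoring code from the study.

```python
from statistics import mean


def fit_line(set_sizes, mean_rts):
    """Ordinary least-squares fit of RT = intercept + slope * set_size."""
    x_bar = mean(set_sizes)
    y_bar = mean(mean_rts)
    sxx = sum((x - x_bar) ** 2 for x in set_sizes)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(set_sizes, mean_rts))
    slope = sxy / sxx                  # msec added per extra memory-set item
    intercept = y_bar - slope * x_bar  # msec not dependent on set size
    return slope, intercept


# Hypothetical mean RTs (msec) for memory sets of 1-4 digits (illustration only).
set_sizes = [1, 2, 3, 4]
mean_rts = [440.0, 478.0, 516.0, 554.0]
slope, intercept = fit_line(set_sizes, mean_rts)
```

Under the serial-scanning interpretation, the slope is usually read as the
time per comparison and the intercept as the combined time for encoding and
responding; one such fit per subject would yield the individual-difference
parameters.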

In  terms  of  operations,  we  will  consider  those  events  that  take  place 
starting  from  the  presentation  of  the  target  number,  since  it  is  assumed 
that  this  paradigm  does  not  measure  any  aspect  of  storing  or  retrieving 
information.  Thus,  when  the  target  stimulus  is  presented,  subjects  must 
encode  the  number.  Following  this  encoding  (which  may  be  visual  or  acoustic, 
depending  upon  the  nature  of  the  representation  of  the  memory  set),  subjects 
compare the target with the memory set. This comparison is (generally)
accomplished in a serial, exhaustive manner -- all items in the memory set
are  compared  prior  to  the  selection  and  execution  of  the  appropriate  response. 

Memory scanning for words and categories (Juola task). Memory search
processes for word names and for categorical information about words were
investigated in an experiment by Juola and Atkinson (1971). They used a
short-term memory search paradigm similar to that used by Sternberg (1967)
in which a short list of items was presented, followed by a single probe
item that might or might not be a member of the memorized list. Two major
conditions were run in the Juola and Atkinson study: a "word scan"
condition and a "category scan" condition. In the first condition, the memory
set consisted of from one to four different words. A positive probe
stimulus was one of the words in the memorized list, while a negative probe was
a word that did not match any of the memory set words. Thus, this condition
was essentially a replication of the Sternberg paradigm using words rather
than numbers. The second condition in the Juola and Atkinson study also
involved a memory set of from one to four words; however, these words were
semantic category labels (e.g., COLOR, RELATIVE, etc.). Positive probe
stimuli were instances of one of the memory set categories (e.g., if the
memory set was COLOR, RELATIVE, a positive probe might be BLUE).

The results of this experiment (and a replication by Juola and McDermott,
1976) showed an increase in response time with the number of memory set items
in both conditions. Furthermore, when linear functions were fit to the data,
the functions had equivalent intercepts for the two conditions, but the slope
was much greater for the categorization condition. The authors argued that
the comparability of intercepts indicated that categorization and comparison
involve many similar processes that do not depend upon the size of the memory
set (e.g., probe word encoding, response decision and execution), while a
difference in slope indicated that fundamentally different types of search or
comparison processes are involved in the two conditions.

The  procedure  used  in  the  present  research  was  a modification  of  the 
Juola  and  Atkinson  task  in  that:  (1)  the  same  category  labels  were  used  in 
both  conditions;  (2)  a relatively  small  set  of  categories  was  employed;  (3) 
several  exemplars  of  each  category  were  used  in  the  categorization  condition; 
and  (4)  negative  probes  were  members  of  other  categories  used  as  memory  set 
items. 

In terms of operations, the "word scan" condition is essentially
equivalent to the Sternberg task in that it requires the encoding of the target
stimulus (in this case, a word rather than a number), followed by a serial
comparison of that representation with the items in the memory set, and the
selection and execution of the appropriate response. The "category scan"
condition requires an additional operation -- the retrieval of categorical
information from LTM. The pattern of Juola and Atkinson's results indicates
that this retrieval operation is performed each time the target word is
compared to a member of the memory set, rather than just once. The results
also suggest that the comparison operation is serial and self-terminating in
this condition, rather than exhaustive.
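The distinction between exhaustive and self-terminating serial comparison can
be made concrete by counting comparisons. The sketch below is a simplified
illustration, not the authors' model: it treats a comparison as a direct
equality check, and the function names and the four-item memory set are
hypothetical.

```python
def comparisons_exhaustive(memory_set, probe):
    """Exhaustive scan: every item is compared, whether or not a match occurs."""
    return len(memory_set)


def comparisons_self_terminating(memory_set, probe):
    """Self-terminating scan: stop at the first match; scan everything if none."""
    for count, item in enumerate(memory_set, start=1):
        if item == probe:
            return count
    return len(memory_set)


memory_set = ["COLOR", "RELATIVE", "ANIMAL", "TOOL"]  # hypothetical memory set

# Mean number of comparisons over positive probes at every list position.
exhaustive_mean = sum(
    comparisons_exhaustive(memory_set, p) for p in memory_set
) / len(memory_set)
self_terminating_mean = sum(
    comparisons_self_terminating(memory_set, p) for p in memory_set
) / len(memory_set)
```

For a set of n items, positive probes require n comparisons under an
exhaustive scan but (n + 1) / 2 on average under a self-terminating scan,
which is why the two strategies predict different slopes for positive
responses.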

Linguistic verification (Clark task). Clark and Chase (1972) developed
and tested a model to account for how people compare information from
linguistic and pictorial sources. Their model applied to a particular type of
sentence verification task in which the subject was presented with a display
containing a sentence and a picture. The sentence was of the form "star (plus)
is (is not) above (below) plus (star)" and the picture showed either a star
above a plus or a plus above a star. The subject had to decide whether the
sentence was a true or false description of the picture. The model accounted
for the latencies of the subject's judgments in terms of four operations or
stages (sentence encoding, picture encoding, comparing, and responding) which
were serially ordered, with component latencies that were additive. The
subject formed internal representations of the sentence and the picture in
terms of their underlying propositions and then performed a series of
comparison operations to check the overall congruence of the representations.
Clark and Chase found that verification time consisted of the addition of one
or more of four parameters that accounted for 99.8 percent of the variance in
response latencies.

The  procedure  in  the  current  study  was  a replication  of  the  sentence 
verification  task  as  used  by  Clark  and  Chase. 

The above description is compatible with the operations terminology
used here in that each type of sentence requires an encoding of a sentence
and a picture, a comparison of those representations, and a response
selection and execution. In addition, our terminology requires that two
additional operations be included: constructing of what has been called a
"kernel" representation, and transforming of the representation, based on
the particular modifiers in the various sentence types. For example, Clark
and Chase argue that "below" is transformed into "not above"; similarly,
they argue that negations and "truth indices" are likewise transformed,
depending upon the given sentence configuration.




Semantic memory retrieval (Collins and Quillian task). A topic of
considerable concern to psychologists is how semantic information is stored,
organized, and retrieved. Of the many paradigms used to investigate these
issues, one of particular interest requires subjects to make true-false
decisions about propositions (Collins and Quillian, 1969). Subjects were
presented with sentences such as "A canary can fly." or "A canary is an
animal." and were asked to ascertain the truth of the statement. The results
of the Collins and Quillian (1969) studies using this paradigm supported a
theory that semantic information is hierarchically organized in memory. Names
of semantic categories are stored at the nodes of a network, along with
"pointers" that indicate the relationship between that category and others
(e.g., subset or superset relationships are represented as a direction to a
different, lower- or higher-order node), and "pointers" to other words
indicating properties of that category. Given this structural model and a
number of assumptions, the authors were able to make predictions about
retrieval time. These assumptions are: first, that both retrieving a property
from a node and moving up a level in a hierarchy take a measurable amount of
time; second, that the times for these two processes are additive wherever one
step is dependent on the completion of another step; and third, that the time
to retrieve a property from a node is independent of the level of the node.

Collins and Quillian (1969) reported results consistent with hypotheses
generated from their model. For example, they found that subjects could
confirm sentences such as "A canary is a bird." more rapidly than "A canary is
an animal."; furthermore, "property" sentences such as "A canary can sing."
were more quickly confirmed than "A canary has skin." The former comparison
was predicted from the hypothesis that "canaries" are a subset of "birds",
which are a subset of "animals"; in order to judge that canaries are animals
the subject must first access the "bird" node, then the "animal" node. Similar
reasoning applies to the second example: "singing" is a property of canaries,
while "having skin" is a property of animals.
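The hierarchical model and its additivity assumption can be sketched as a
small network in which retrieval time grows with the number of levels
traversed. This is an illustration only: the dictionary layout, the function
names, the placement of "can fly" at the bird node, and all timing constants
are assumptions, not values reported in the source.

```python
# Hypothetical three-level hierarchy built from the example sentences above.
SUPERSET = {"canary": "bird", "bird": "animal", "animal": None}
PROPERTIES = {
    "canary": {"can sing"},
    "bird": {"can fly"},      # assumed placement, for illustration
    "animal": {"has skin"},
}


def levels_to_superset(node, target):
    """Count upward moves needed to reach `target` from `node` (None if absent)."""
    levels = 0
    while node is not None:
        if node == target:
            return levels
        node = SUPERSET[node]
        levels += 1
    return None


def levels_to_property(node, prop):
    """Count upward moves needed to reach the node that stores `prop`."""
    levels = 0
    while node is not None:
        if prop in PROPERTIES[node]:
            return levels
        node = SUPERSET[node]
        levels += 1
    return None


def predicted_time(levels, needs_property, base_ms=1000.0, step_ms=75.0,
                   property_ms=225.0):
    """Additive prediction: base time + time per level + property retrieval.

    All three timing constants are illustrative assumptions.
    """
    return base_ms + levels * step_ms + (property_ms if needs_property else 0.0)
```

With this layout, "A canary is a bird." requires one upward move and "A canary
is an animal." two, so the sketch reproduces the ordering of confirmation
times described above; the same holds for "can sing" (zero moves) versus "has
skin" (two moves).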

Subsequent research has generated other storage and retrieval models
that could account for these data. However, it was felt that this paradigm
was still useful as a means of generating reliable data on how subjects access
a particular (restricted) information structure, especially a structure
which could conceivably be organized hierarchically. Hence, the Collins
and Quillian paradigm was adapted for the purposes of the current project
but interpreted only in terms of the information structures contained in
the stimuli. The adaptation involved creating additional sets of positive
sentences and generating companion sets of negative sentences according to
the property and set rules used with the positive sentences.

In terms of operations, both "superset" and "property" sentences
require the subject to encode the sentences and construct kernel
representations. Both sentence types also require the retrieval of superset
information from LTM; "property" sentences also require retrieval of property
information. Finally, both sentence types require the selection and
execution of the appropriate response.

Recognition memory (Shepard and Teghtsoonian task). Shepard and
Teghtsoonian (1961) developed a procedure for measuring the capacity of human
memory under "conditions approaching a steady state" -- where the possibility
of rehearsal is minimized while the interference of preceding material is
maximized. They argued that situations which confront people with a
continuing sequence of items and require them to retain as much as possible of
the most recently presented information (e.g., continuous monitoring of
complex displays) involve memory processes differing from those tested by most
other paradigms. The procedure they employed was a recognition task: subjects
were presented with a lengthy list of items and were asked to identify each
item as "old" (i.e., previously presented) or "new". The lists were
constructed so that the interlist intervals between the original and test
presentations of items varied. The authors were able to infer a retention
function for a single item by plotting probability of recognition as a
function of test lag.
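A retention function of this kind can be tabulated directly from a continuous
old/new sequence. The sketch below is a minimal illustration under stated
assumptions: lag is taken as the difference in list position between an
item's test presentation and its most recent earlier presentation, and the
function name, item codes, and responses are hypothetical.

```python
from collections import defaultdict


def retention_by_lag(items, responses):
    """Proportion of repeated items called "old", tabulated by test lag."""
    last_seen = {}                        # item -> position of latest presentation
    counts = defaultdict(lambda: [0, 0])  # lag -> [hits, opportunities]
    for position, (item, response) in enumerate(zip(items, responses)):
        if item in last_seen:             # a genuinely old item is being tested
            lag = position - last_seen[item]
            counts[lag][1] += 1
            if response == "old":
                counts[lag][0] += 1
        last_seen[item] = position
    return {lag: hits / total for lag, (hits, total) in sorted(counts.items())}


# Hypothetical sequence of three-digit items and one subject's responses.
items = [101, 202, 101, 303, 202, 404, 303, 101]
responses = ["new", "new", "old", "new", "new", "new", "old", "old"]
retention = retention_by_lag(items, responses)
```

Plotting the resulting proportions against lag gives the kind of single-item
retention curve described above; a real analysis would pool many more
observations per lag.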

In addition to standard parameter estimates, this paradigm is ideal
for estimating parameters derived from signal-detection theory. Using the
observed proportions of the two types of errors (i.e., calling an old item
"new" and calling a new item "old"), it is possible to generate, for each
subject, an estimate of d' and beta (respectively, an estimate of "true"
discriminability, and the location of the subject's subjective decision
bias criterion).
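One common way to obtain these two estimates is the equal-variance Gaussian
signal-detection model, sketched below. The source does not state which
estimation procedure was used, so the model choice, the function name, and
the example error proportions are assumptions. The sketch relies on
statistics.NormalDist from the Python standard library (Python 3.8 or later).

```python
from statistics import NormalDist


def dprime_and_beta(miss_rate, false_alarm_rate):
    """Estimate d' and beta from the two error proportions.

    miss_rate:        proportion of old items called "new"
    false_alarm_rate: proportion of new items called "old"
    Both proportions must lie strictly between 0 and 1.
    """
    std_normal = NormalDist()                     # mean 0, standard deviation 1
    hit_rate = 1.0 - miss_rate
    z_hit = std_normal.inv_cdf(hit_rate)
    z_fa = std_normal.inv_cdf(false_alarm_rate)
    d_prime = z_hit - z_fa                        # discriminability
    beta = std_normal.pdf(z_hit) / std_normal.pdf(z_fa)  # likelihood ratio at criterion
    return d_prime, beta


# Hypothetical subject: 20 percent misses and 20 percent false alarms.
d_prime, beta = dprime_and_beta(0.20, 0.20)
```

When the two error proportions are equal, as in this example, beta equals 1
(no bias toward either response); proportions of exactly 0 or 1 need a
correction before the inverse normal transform can be applied.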

The  present  experiment  used  the  Shepard  and  Teghtsoonian  procedure; 
however,  the  stimuli  were  reconstructed  so  that  exactly  the  same  number  of 
intervals  occurred  in  a list  of  items. 

In terms of operations, this task requires the encoding and storing
(in LTM) of numbers and the retrieval of these numbers from LTM. In
addition, the most convenient way to describe the recognition judgment is
to consider it as a comparison operation -- subjects compare each number
with their LTM set and judge the "strength of activation"; this judgment
determines which response will be selected and executed.

Summary 

Table 1 presents an overview of the tasks included in the present
experiment and the hypothesized operations included in each task condition.
The following method section details the specific implementation, procedures,
and scoring rules for each task. Instructions for each of the tasks are
included in Appendix A.


Table 1

Operational Overview of Tasks

Operations are listed for each task condition in the column order Encode,
Construct, Transform, Store, Retrieve, Search, Compare, Respond; columns with
no entry for a condition are omitted.

POSNER PHYSICAL
    Encode:     Encode letters
    Compare:    Compare representations of letter patterns
    Respond:    Select and execute response

POSNER NAME
    Encode:     Encode letters
    Retrieve:   Retrieve name from LTM
    Compare:    Compare representations of letter names
    Respond:    Select and execute response

POSNER CATEGORY
    Encode:     Encode letters
    Retrieve:   Retrieve name from LTM; Retrieve category from LTM
    Compare:    Compare representations of letter categories
    Respond:    Select and execute response

MEYER WORD
    Encode:     Encode letter string
    Construct:  Construct phonemic or graphemic representation
    Search:     Search in LTM for "word"
    Respond:    Select and execute response

MEYER NONWORD
    Encode:     Encode letter string
    Construct:  Construct phonemic or graphemic representation
    Search:     Search in LTM for "word"; Search in LTM for "word" rules
    Respond:    Select and execute response

BARON SH (Visual)
    Encode:     Encode semantic phrases
    Construct:  Construct visual representation
    Search:     Search LTM for "phrase" rules
    Respond:    Select and execute response

BARON HN (Acoustic)
    Encode:     Encode semantic phrases
    Construct:  Construct acoustic representation
    Search:     Search LTM for "phrase" rules
    Respond:    Select and execute response

BARON SN
    Encode:     Encode semantic phrases
    Construct:  Construct visual or acoustic representation
    Search:     Search LTM for "phrase" rules
    Respond:    Select and execute response

STERNBERG
    Encode:     Encode target number
    Compare:    Compare numbers (serial)
    Respond:    Select and execute response

JUOLA WORD
    Encode:     Encode target word
    Compare:    Compare words (serial)
    Respond:    Select and execute response

JUOLA CATEGORY
    Encode:     Encode target word
    Retrieve:   Retrieve category from LTM
    Compare:    Compare categories (serial)
    Respond:    Select and execute response

CLARK AND CHASE
    Encode:     Encode sentence; Encode picture
    Construct:  Construct kernel representations
    Transform:  Transform "below" representation; Transform "negation"
                representation; Transform truth indices
    Compare:    Compare sentence and picture representations
    Respond:    Select and execute response

COLLINS AND QUILLIAN SUPERSET
    Encode:     Encode sentence
    Construct:  Construct kernel representation
    Retrieve:   Retrieve superset information from LTM
    Respond:    Select and execute response

COLLINS AND QUILLIAN PROPERTY
    Encode:     Encode sentence
    Construct:  Construct kernel representation
    Retrieve:   Retrieve superset information from LTM; Retrieve property
                information from LTM
    Respond:    Select and execute response

SHEPARD AND TEGHTSOONIAN
    Encode:     Encode numbers
    Store:      Store items in LTM
    Retrieve:   Retrieve numbers from LTM
    Compare:    Judge strength of activation
    Respond:    Select and execute response

METHOD 




Testing  facilities 

As the result of an interservice agreement between the Office of
Naval Research (ONR) and the Army Research Institute (ARI), the first
experiment of the project was conducted at ARI's computer-controlled
Information Systems Laboratory. The laboratory was arranged with a
specific hardware configuration to accommodate the experiment. Five
subject stations, each consisting of a CRT display screen, a
typewriter-like keyboard, and a telephone, were set up, each in an individual
screened-off area, but all in close physical proximity to one another. All
software (including programs, stimuli, and response buffers) was maintained on
a mountable disk pack assigned exclusively to the project.

The software was developed to permit up to four subjects to be run
at a time; the fifth station was used by the experimenter to initialize
the program (e.g., supply identifying information on each subject,
indicate the session number, and so on), and to monitor and control the
progress of the experiment. From his station the experimenter could monitor
the progress of any individual subject, check on intermediate results,
initiate each of the eight individual tasks being studied, or restart the
experiment in case of hardware failure. Any station could be used as the
experimenter station, and every station could communicate with the selected
experimenter station by telephone. In addition, any selected experimenter
station was only a few feet away from the other stations so that the
experimenter could be on the scene quickly to answer questions, etc.

The software ran each block of trials as a unit, but permitted
subjects to proceed through a block at their own pace. Further, instructions
both before and after practice blocks could be provided via the CRT screen.
Stimuli for each trial were presented on the CRT after preset intervals,
and subjects made their responses on the keyboard. Feedback ("correct"
or "wrong", and the latency if "correct") was then provided on the CRT
screen. Response latencies were timed to the nearest 3 msec or better.
Response and latency information was recorded in the computer memory, and
written on a disk file at the end of each block of trials. Thus, in case
of a system failure, only data from the most recent block of trials, or at
worst the most recent task, would be lost. The disk response files were
dumped onto tape after each session for later analysis.




Procedure 

The  subjects  participated  in  two  testing  sessions,  each  approximately 
two  hours  in  length  and  scheduled  two  days  apart.  All  eight  tasks  were 
presented  in  each  session,  in  the  same  order.  Stimulus  order  in  each  task 
was  randomized,  and  where  appropriate,  different  stimuli  were  used  in  each 
testing  session. 

At the beginning of each session the experimenter conducted a dialogue
with the computer to provide information on which station he would use, the
number of subjects to be run in the session, which station each would use,
the session number (day one or day two), and identifying information on each
subject so that his two sessions could later be put together for analysis.
He could also indicate that he was executing a restart after a hardware
failure, in which case the above information was automatically retrieved.

The  experimenter  then  initiated  the  first  task.  In  this,  and  every  other 
task,  the  computer  instructed  subjects  to  read  the  appropriate  instructions 
and  call  the  experimenter  if  there  were  any  questions.  It  then  waited  for  a 
signal  from  each  subject,  indicating  that  he  was  ready.  The  computer  then 
provided  a block  of  practice  trials;  each  subject  could  proceed  through  this 
practice block at his own pace. The system then asked if there were any
questions  and  waited  again.  The  experimenter  was  available  by  phone  or  in 
person  to  give  assistance.  After  all  subjects  signaled  that  they  were  ready, 
the  system  ran  each  subject  through  the  actual  task  at  his  own  pace.  When 
all  were  finished,  the  experimenter  was  notified;  he  could  at  this  point 
allow  a rest  break,  or  start  the  next  task  as  appropriate. 

Subjects

The subjects were 54 female and male students from Georgetown
University; they were paid for their participation in the study. Because of
occasional computer system failures, the number of subjects administered
each task varied slightly.

The  following  sections  describe  the  detailed  procedure  for  each  task. 




Posner  task 

Procedure. Each trial began with two central fixation points, one to the
left of center of the display area, the other to the right of center. After
a foreperiod delay, both points were replaced simultaneously by letters.
The subject judged whether the two letters were the same or different.
As soon as the subject responded, both letters were removed. The
intertrial interval was approximately two seconds. The following schematic
shows the events in a typical trial:


[Schematic: Subject shown letter pattern -> Subject responds ("same" or
"different"), with example letter pairs for the physical, name, and category
(vowel) match types.]


Stimuli and Design. Each subject classified pairs of letters under three
instructional conditions: physical match, name match, and rule match (in
that order). A block of trials in the physical or name match conditions
consisted of 72 pairs of letters; a block of trials in the rule match
condition consisted of 64 pairs of letters. Subjects completed one block per
condition per day. The stimuli themselves were various combinations of
the letters A, E, H, and T. The order of pairs within a given block was
randomized with respect to the occurrence of any letter or letter pair.
In addition, the order of responses was randomized within each block with
the restriction that the subject (if responding correctly) did not repeat
the same response more than three consecutive times. Each subject repeated
the procedures on the second day (with, of course, a different stimulus
sequence).
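The restriction on consecutive responses can be met in several ways; the
source does not say how the sequences were generated. The sketch below uses
simple rejection sampling (reshuffle until no run exceeds the limit). The
function names and the even 36/36 split of "same" and "different" trials in
a 72-pair block are assumptions made for illustration.

```python
import random


def longest_run(sequence):
    """Length of the longest run of identical consecutive elements."""
    longest = current = 0
    previous = object()  # sentinel that equals nothing in the sequence
    for element in sequence:
        current = current + 1 if element == previous else 1
        longest = max(longest, current)
        previous = element
    return longest


def shuffle_with_run_limit(responses, max_run=3, seed=None, max_tries=100_000):
    """Return a random order in which no response repeats more than max_run times."""
    rng = random.Random(seed)
    order = list(responses)
    for _ in range(max_tries):
        rng.shuffle(order)
        if longest_run(order) <= max_run:
            return order
    raise RuntimeError("no acceptable order found; relax max_run or raise max_tries")


# Hypothetical 72-trial block with an assumed even split of correct responses.
block = shuffle_with_run_limit(["same"] * 36 + ["different"] * 36, seed=1)
```

Rejection sampling keeps every acceptable order equally likely, at the cost
of discarding many shuffles; a constructive method would be faster but needs
more care to avoid biasing the sequence.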


Variables. The principal data were the mean RTs of correct responses
for the four baseline measures, which included "same" judgments for the
physical, name, and category conditions and a "different" judgment
calculated across conditions. In addition, two difference scores were
calculated: a name minus physical match score and a category minus name match
score. The percentage of errors and mean error time across conditions
were also obtained.
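The scores listed above can be computed from trial-level records as in the
following sketch. The Trial record layout, the function name, and the example
values are hypothetical; they are not the scoring program or data of the
study.

```python
from statistics import mean
from typing import NamedTuple


class Trial(NamedTuple):
    condition: str       # "physical", "name", or "category"
    correct_answer: str  # "same" or "different"
    is_correct: bool     # whether the subject's response was correct
    rt_msec: float       # reaction time in milliseconds


def posner_scores(trials):
    """Baseline means, difference scores, and error summaries for one subject."""
    def mean_correct_same(condition):
        return mean(t.rt_msec for t in trials
                    if t.is_correct and t.correct_answer == "same"
                    and t.condition == condition)

    errors = [t for t in trials if not t.is_correct]
    scores = {
        "physical_same": mean_correct_same("physical"),
        "name_same": mean_correct_same("name"),
        "category_same": mean_correct_same("category"),
        # "different" judgments are pooled across the three conditions.
        "different": mean(t.rt_msec for t in trials
                          if t.is_correct and t.correct_answer == "different"),
        "percent_errors": 100.0 * len(errors) / len(trials),
        "mean_error_rt": mean(t.rt_msec for t in errors) if errors else None,
    }
    scores["name_minus_physical"] = scores["name_same"] - scores["physical_same"]
    scores["category_minus_name"] = scores["category_same"] - scores["name_same"]
    return scores


# Hypothetical trials for one subject (illustration only).
trials = [
    Trial("physical", "same", True, 400.0), Trial("physical", "same", True, 420.0),
    Trial("name", "same", True, 480.0), Trial("name", "same", True, 500.0),
    Trial("category", "same", True, 560.0), Trial("category", "same", True, 600.0),
    Trial("physical", "different", True, 450.0),
    Trial("name", "different", True, 470.0),
    Trial("category", "same", False, 700.0),
    Trial("name", "different", False, 650.0),
]
scores = posner_scores(trials)
```

The sketch assumes every condition has at least one correct "same" trial;
statistics.mean raises an error on an empty selection.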

Meyer  task 

Procedure. Each trial began with two central fixation points displayed on
the CRT, one point above the other, which served as a cueing signal. After
a foreperiod delay, the top point was replaced by the first string of
capital letters; the subject judged whether or not it was a word. As soon as
the subject responded, the top letter string was removed, and there was a
short delay (500 msec) followed by the second string of letters. The
subject again judged whether it was a word. Reaction time was measured
separately for each string from the stimulus onset to the response. The
intertrial interval was approximately two seconds. The following schematic
shows the events in a typical trial:


Subject shown        Subject          Subject shown        Subject
letter string        responds         letter string        responds

GRINT           ►    "nonword"   ►    PRINT           ►    "word"
FENCE           ►    "word"      ►    HENCE           ►    "word"
MINT            ►    "word"      ►    PINT            ►    "word"
GUIDE           ►    "word"      ►    LIED            ►    "word"
BEARD           ►    "word"      ►    GONE            ►    "word"

Stimuli and Design. During the eight test blocks (four on each day), each
subject classified a total of 320 letter strings; each block, therefore,
contained 40 strings or 20 trials of two strings each. Of the 160 pairs,
64 were word-word (WW) pairs, 32 were nonword-word (NW) pairs, 32 were WN
pairs, and 32 were NN pairs. Among the 64 WW pairs, there were 16 each of
the following types:




1.  Graphemically and phonemically similar (e.g., FENCE-HENCE),

2.  Graphemically similar, phonemically dissimilar (e.g., MINT-PINT),

3.  Graphemically dissimilar, phonemically similar (e.g., GUIDE-LIED),

4.  Graphemically and phonemically dissimilar (e.g., BEARD-GONE).

In  addition  to  the  WW  pairs,  there  were  a variety  of  pairs  involving  non- 
words that  followed  the  general  rules  of  English  orthography  and  phonology. 
There  were  four  types  of  such  pairs: 

1.  Word-nonword, graphemically similar (e.g., DRUNK-FFUNK),

2.  Word-nonword, graphemically dissimilar (e.g., BOOT-ZAS),

3.  Nonword-word, graphemically similar (e.g., GRINT-PRINT),

4.  Nonword-word, graphemically dissimilar (e.g., MENTH-JOY).

It should be noted that graphemically similar nonwords could be
pronounced like their word counterparts; thus, there was probably a degree of
phonemic similarity in the first and third types above. Finally, there were
two types of nonword-nonword pairs, those that looked similar (e.g.,
GRAT-TRAT), and those that looked dissimilar (e.g., DOPLY-HJCKEL). Each test
block contained two examples of each of the 10 pair types; the order of pair
types was randomized within each block with the restriction that the subject
(if responding correctly) did not repeat the same response (i.e., word or
nonword) more than three consecutive times nor was he exposed to more than
three consecutive pairs with the same graphemic-phonemic relationship.

Variables.  The  principal  data  were  the  mean  RTs  of  correct  responses 
for  the  first  letter  string  (word  or  nonword)  of  each  pair  and  for  the  second 
letter  string  in  each  of  the  ten  WW,  WN,  NW,  and  NN  types.  Several  parameters 
were  derived  that  involved  the  four  WW  pairs.  A "phonemic  facilitation"  es- 
timate was  obtained  by  calculating  the  difference  between  the  mean  RT  from  the 
second  member  of  the  phonemically  and  graphemically  similar  pairs  and  the  con- 
trol (both  dimensions  dissimilar)  condition;  a "graphemic  interference"  score 
was  calculated  by  finding  the  difference  between  the  mean  RT  from  the  second 
member  of  the  phonemically  dissimilar,  graphemically  similar  pairs  and  the 
control. Both of these parameters were reported by Meyer et al. and were
computed for the purpose of comparison. In addition, a third estimate was
made  to  determine  if  phonemic  similarity  alone  would  facilitate  recognition; 
this  estimate  was  obtained  by  calculating  the  difference  between  mean  RT  from 
the  second  member  of  the  phonemically  similar,  graphemically  dissimilar  pairs 
and  the  control.  Percentage  of  errors  and  mean  error  time  across  conditions 
were  also  calculated. 
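The three derived scores for the word-word pairs can be sketched as simple differences against the control condition. This is a minimal illustration, not the authors' program: the dictionary keys and RT values are hypothetical, and the sign convention (control minus condition, so that a positive score means faster responding than control) is an assumption chosen to match the "Type 4 minus Type N" wording used later in the report.

```python
def meyer_ww_scores(mean_rt):
    """Derive the three WW difference scores from mean second-string RTs (msec).

    mean_rt maps a (hypothetical) condition name to the mean correct RT for
    the second letter string of that word-word pair type.
    Sign convention (an assumption): control minus condition, so a positive
    value means the condition was answered faster than the control.
    """
    control = mean_rt["both_dissimilar"]
    return {
        # phonemically and graphemically similar pairs vs. control
        "phonemic_facilitation": control - mean_rt["both_similar"],
        # graphemically similar, phonemically dissimilar pairs vs. control
        "graphemic_interference": control - mean_rt["graphemic_only"],
        # phonemically similar, graphemically dissimilar pairs vs. control
        "phonemic_only_effect": control - mean_rt["phonemic_only"],
    }


# Illustrative values only; these are not data from the report.
example = meyer_ww_scores({
    "both_similar": 600.0,
    "graphemic_only": 660.0,
    "phonemic_only": 640.0,
    "both_dissimilar": 630.0,
})
```

With this convention a negative "graphemic_interference" value indicates slower responding than control, that is, interference.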


Baron  task 

Procedure.  Each  trial  began  with  a fixation  point  which  served  as  a cueing 
signal.  After  a foreperiod  delay,  the  subject  was  presented  with  a four- 
word phrase. The subject judged whether or not the phrase made sense (as
a function of the instructional condition). Intertrial intervals were
approximately three seconds. The following schematic shows the events in a
typical  trial: 


      Subject shown phrase      ►   Subject responds

SH    The sky is cloudy         ►   "sense"
      It's knot so              ►   "nonsense"

HN    See water is salty        ►   "sense"
      The knife is pull         ►   "nonsense"

SN    Please cash my check      ►   "sense"
      A deck of carts           ►   "nonsense"

Stimuli  and  Design.  Each  subject  performed  in  three  conditions,  the  condi- 
tions defined  by  the  stimulus  array  and  the  instructional  set: 

1.  SH  condition,  where  the  stimulus  phrases  were  either  sense 
(S)  or  homophone  (H)  and  subjects  were  told  to  call  H phrases 
nonsense; 

2.  HN  condition,  where  the  phrases  were  either  H or  pure  nonsense 
(N)  and  subjects  were  told  to  call  H phrases  sense;  and 

3.  SN condition, where the phrases were either S or N and subjects were
simply instructed to judge the phrases as sense or nonsense.

Each  subject  completed  two  blocks  of  20  trials  per  block  in  each  con- 
dition on  each  day.  Within  a block  of  trials,  the  order  of  phrases  was 
randomized. 




Variables. The basic data were the mean RTs for each phrase type (S,
H, and N) as a function of condition (SN, SH, and HN). The data were
combined to generate overall condition times (i.e., SN time, SH time, and HN
time). Also, the ratio of SH time to HN time was calculated. This ratio
was used to categorize subjects as either visual (low ratio) or phonemic
(high ratio). Both percentage of errors and mean error time across
conditions were also calculated.
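A small sketch of the SH/HN ratio and the visual/phonemic categorization follows. The report does not state the cutoff that separates "low" from "high" ratios, so the median split used here is purely an assumption, as are the function names and the example times.

```python
from statistics import median


def sh_hn_ratio(sh_time, hn_time):
    """Ratio of overall SH condition time to overall HN condition time."""
    return sh_time / hn_time


def classify_by_ratio(ratios):
    """Label each subject 'visual' (low ratio) or 'phonemic' (high ratio).

    ratios maps a subject id to that subject's SH/HN ratio.
    ASSUMPTION: the cutoff is the sample median; the report gives no cutoff.
    """
    cutoff = median(ratios.values())
    return {
        subject: ("phonemic" if ratio > cutoff else "visual")
        for subject, ratio in ratios.items()
    }


# Hypothetical condition times (msec) for three subjects.
ratios = {
    "s1": sh_hn_ratio(1400.0, 2000.0),  # 0.70
    "s2": sh_hn_ratio(1800.0, 2000.0),  # 0.90
    "s3": sh_hn_ratio(2400.0, 2000.0),  # 1.20
}
labels = classify_by_ratio(ratios)
```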




Sternberg  task 

Procedure.  Each  trial  began  with  the  presentation  of  the  positive  memory 
set.  On  each  trial,  a set  of  digits,  ranging  randomly  over  trials  from 
one  to  four  different  digits,  was  displayed  for  a duration  directly  pro- 
portional to  the  length  of  the  set  (one  sec.  per  number).  At  the  end  of 
this  interval,  the  displayed  set  was  removed,  and  a cueing  signal  appeared, 
followed  by  the  presentation  of  a single  probe  digit.  The  subject  decided 
whether  or  not  the  probe  digit  was  a member  of  the  positive  set.  Reaction 
time  was  measured  from  onset  of  the  probe  digit  to  the  response  execution. 
The  intertrial  interval  was  approximately  two  seconds.  Again,  below  are 
events  in  a typical  trial: 

Subject shown memory set   ►   Subject shown probe digit   ►   Subject responds


Stimuli and Design. Each subject completed a total of 10 test blocks (five
on each day), each block consisting of 25 trials. The 25 trials in each
block were composed of cases for memory set sizes of one to four items.
Half of the trials were positive (i.e., the probe item was contained in
the memory set) and half were negative. Individual trials were generated
so that, for the positive items, there was no bias with respect to serial
position in the memory set of the probe digit. The sequence of trials
within each block was randomized with respect to memory set size, with the
restrictions that the subject was not exposed to more than three consecutive
trials of a particular memory set size, nor was he required (if responding
correctly) to repeat the same response more than three times in succession.
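One way to produce a trial order that respects both run-length restrictions is rejection sampling: shuffle, check, and reshuffle until the order is admissible. The sketch below is an assumption about how such a sequence could be generated, not a description of the software actually used; the function names, the 24-trial block, and the seed are hypothetical.

```python
import random


def longest_run(labels):
    """Return the length of the longest run of identical consecutive labels."""
    longest = current = 0
    previous = object()  # sentinel that compares unequal to every real label
    for label in labels:
        current = current + 1 if label == previous else 1
        longest = max(longest, current)
        previous = label
    return longest


def constrained_order(trials, keys, limit=3, rng=None, max_tries=100_000):
    """Shuffle trials until no key yields a run longer than `limit`.

    keys is a list of functions, each mapping a trial to the label whose
    consecutive repetitions are restricted (e.g., set size, correct response).
    """
    rng = rng or random.Random()
    order = list(trials)
    for _ in range(max_tries):
        rng.shuffle(order)
        if all(longest_run([key(t) for t in order]) <= limit for key in keys):
            return order
    raise RuntimeError("no admissible order found; relax the limit or raise max_tries")


# Hypothetical block: set sizes 1-4 crossed with yes/no probes, three of each.
trials = [(size, response)
          for size in (1, 2, 3, 4)
          for response in ("yes", "no")
          for _ in range(3)]
block = constrained_order(
    trials,
    keys=[lambda t: t[0], lambda t: t[1]],  # set size, correct response
    rng=random.Random(0),
)
```

Rejection sampling keeps every admissible order equally likely, at the cost of discarding shuffles that violate a restriction.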

Variables. The principal data were the mean RTs of correct responses
for  each  memory  set  size.  The  RTs  for  both  positive  and  negative  responses 
were  used  to  calculate  (for  each  subject)  the  slope  and  intercept  of  the 
best-fitting  linear  function  relating  mean  RT  to  memory  set  size.  In  addi- 
tion, percentage  of  errors  and  mean  error  time  across  conditions  were  also 
computed. 
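The slope and intercept of the best-fitting line can be obtained with ordinary least squares. The following is a minimal sketch under that assumption (the report says only "best-fitting linear function"); the function name and the four illustrative mean RTs are hypothetical and were chosen to lie exactly on a line.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x.

    Returns (slope, intercept). Requires at least two distinct x values.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Illustrative mean RTs (msec) for memory set sizes 1 to 4.
set_sizes = [1, 2, 3, 4]
mean_rts = [552.0, 614.0, 676.0, 738.0]
slope, intercept = fit_line(set_sizes, mean_rts)
```

Here the slope is read as the scanning time per memory-set item and the intercept as the time for all processes that do not depend on set size.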

Juola  task 

Procedure.  The  procedures  for  the  two  conditions  (word  and  category)  were 
identical  to  each  other  and  essentially  equivalent  to  the  procedures  used 
in  the  Sternberg  task  described  above.  Each  trial  began  with  the  presenta- 
tion of  the  positive  memory  set.  On  each  trial,  a set  of  words,  ranging 
randomly  over  trials  from  one  to  four  different  words,  was  displayed  for  a 
duration  directly  proportional  to  the  length  of  the  list.  At  the  end  of 
this  interval,  a cueing  signal  appeared,  followed  shortly  by  the  presenta- 
tion of  a single  probe  word.  The  subject  decided  whether  or  not  the  probe 
word  was  included  in  the  memory  set  (for  word  condition)  or  was  an  exemplar 
of  one  of  the  categories  in  the  memory  set  (for  the  category  condition) . 
Reaction  time  was  measured  from  the  onset  of  the  probe  word  to  the  response 
execution.  The  intertrial  interval  was  approximately  two  seconds.  Below 
are  events  in  a typical  trial: 


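Subject shown          Subject shown          Subject
memory set             probe word             responds

clothing insect   ►    insect            ►    "Yes"
clothing color    ►    vehicle           ►    "No"

[The remaining example rows of this schematic, including those for the
category condition, are not legible in the source.]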




Stimuli  and  Design.  Each  subject  completed  a total  of  eight  test  blocks 
(four  on  each  day),  four  blocks  per  condition  (the  word  condition  blocks 
were  presented  first  on  each  day),  each  block  consisting  of  30  trials.  The 
30  trials  in  each  block  were  composed  of  cases  for  memory  set  sizes  of 
one  to  four  words.  Half  of  the  trials  were  positive  and  half  were  negative. 
Individual  trials  were  generated  so  that  there  was  no  positional  bias  in  the 
presentation  of  positive  members  of  the  memory  set.  The  sequence  of  trials 
was  randomized  with  respect  to  memory  set  size,  with  the  restrictions  that 
the  subject  was  not  exposed  to  more  than  three  consecutive  trials  of  a parti- 
cular memory  set  size,  nor  was  he  required  to  repeat  the  same  response  (if 
responding  correctly)  more  than  three  consecutive  times. 


The  words  themselves  consisted  of  nine  category  names  (from  which  the 
memory  sets  in  both  conditions  were  composed  and  from  which  the  probe  stimuli 
in  the  word  condition  were  chosen)  and  nine  exemplars  of  each  category  as 
shown  below: 


A.  Color:     (1) Blue, (2) Red, (3) Green, (4) Yellow, (5) Black, (6) White,
               (7) Pink, (8) Brown, and (9) Gray

B.  Bird:      (1) Robin, (2) Sparrow, (3) Eagle, (4) Crow, (5) Duck, (6) Hawk,
               (7) Parrot, (8) Dove, and (9) Owl

C.  Tree:      (1) Oak, (2) Maple, (3) Cedar, (4) Elm, (5) Pine, (6) Spruce,
               (7) Birch, (8) Poplar, and (9) Fir

D.  Fruit:     (1) Lemon, (2) Plum, (3) Apple, (4) Peach, (5) Cherry, (6) Grape,
               (7) Pear, (8) Banana, and (9) Lime

E.  Fish:      (1) Shark, (2) Trout, (3) Cod, (4) Salmon, (5) Sardine, (6) Perch,
               (7) Tuna, (8) Whale, and (9) Flounder

F.  Insect:    (1) Fly, (2) Ant, (3) Bee, (4) Spider, (5) Moth, (6) Flea,
               (7) Termite, (8) Beetle, and (9) Wasp

G.  Clothing:  (1) Shirt, (2) Pants, (3) Shoes, (4) Blouse, (5) Coat, (6) Dress,
               (7) Hat, (8) Jacket, and (9) Gloves

H.  Family:    (1) Aunt, (2) Uncle, (3) Father, (4) Mother, (5) Brother,
               (6) Sister, (7) Cousin, (8) Niece, and (9) Nephew

I.  Vehicle:   (1) Car, (2) Bus, (3) Plane, (4) Truck, (5) Bike, (6) Train,
               (7) Wagon, (8) Taxi, and (9) Boat




Variables. The principal data were the mean RTs of correct responses
for  each  memory  set  size  in  each  condition.  For  each  subject,  the  slope 
and  intercept  of  the  best  fitting  linear  function  were  calculated  for  both 
positive  and  negative  responses.  Again,  the  error  parameters  were  computed. 

Clark  and  Chase  task 

Procedure.  Each  trial  began  with  a central  fixation  point  which  served  as 
a cueing  signal.  After  a foreperiod  delay,  the  subject  was  presented  si- 
multaneously with  a sentence  on  the  left  and  a picture  on  the  right  side 
of  the  display.  The  subject  read  the  sentence  and  decided  whether  or  not 
the  sentence  was  an  accurate  description  of  the  picture.  Reaction  time 
was  measured  from  the  stimulus  onset  to  response;  the  intertrial  interval 
was  approximately  two  seconds.  Below  are  events  in  a typical  trial: 


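Subject shown sentence and picture   ►   Subject responds

Star isn't below cross    [picture]
Cross isn't below star    [picture]

[The pictures and all but one response ("False") are not legible in the
source schematic.]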


Stimuli and Design. Each subject completed 10 blocks of 16 trials (five
blocks per day), each block consisting of a different random order of 16
displays. The 16 displays consisted of eight sentences paired with one
of two pictures.

The sentences included "Star is above plus," "Star is below plus,"
"Star isn't above plus," and "Star isn't below plus," as well as the four
corollary sentences with "star" and "plus" interchanged. The pictures
were either an asterisk (star) directly above a plus or a plus directly
above an asterisk. Each of the 16 displays could be characterized along three
dimensions: 

1.  whether  it  was  a positive  (P)  or  a negative  (N)  sentence  (e.g., 
"Star  is  (isn't)  above  plus."); 

2.  whether  the  sentence  was  a true  description  of  the  picture  (T) 
or  a false  description  (F);  and 


3.  whether the sentence used the preposition "above" (A) or "below" (B).

Thus, in each block of 16 displays, there were two examples of each of
eight display types: PTA, PTB, PFA, PFB, NTA, NTB, NFA, and NFB.


Variables. The primary data were the mean RTs of correct responses to
each of the eight display types. These means were used to calculate
estimated values for each of Clark and Chase's four parameters previously
described. These parameters were estimated according to the following
equations:


$$a = \tfrac{1}{4}\,[(PTB - PTA) + (PFB - PFA) + (NTB - NTA) + (NFB - NFA)]$$

$$b + d = \tfrac{1}{4}\,[(NTA - PFA) + (NTB - PFB) + (NFA - PTA) + (NFB - PTB)]$$

$$c = \tfrac{1}{4}\,[(PFA - PTA) + (PFB - PTB) + (NTA - NFA) + (NTB - NFB)]$$

$$t_0 = \tfrac{1}{2}\,[PFA + PTB - NTB + NFA]$$

Both  the  percentage  of  errors  and  mean  error  time  across  conditions  were 
also  obtained. 
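The parameter estimates can be sketched as direct arithmetic on the eight display-type means. This is an illustration under stated assumptions rather than the authors' code: the function and key names are hypothetical, the base-time estimate is written as (PFA + PTB - NTB + NFA) / 2, and the example means are generated from an additive model (base time, plus a for "below", plus c for a mismatch, plus b + d for a negative sentence) so that the estimates can be checked against known values.

```python
def clark_chase_parameters(rt):
    """Estimate a, b + d, c, and base time from the eight display-type means.

    rt maps display-type codes (e.g., "PTA" = positive, true, above) to the
    mean correct RT in msec.
    """
    a = ((rt["PTB"] - rt["PTA"]) + (rt["PFB"] - rt["PFA"])
         + (rt["NTB"] - rt["NTA"]) + (rt["NFB"] - rt["NFA"])) / 4
    b_plus_d = ((rt["NTA"] - rt["PFA"]) + (rt["NTB"] - rt["PFB"])
                + (rt["NFA"] - rt["PTA"]) + (rt["NFB"] - rt["PTB"])) / 4
    c = ((rt["PFA"] - rt["PTA"]) + (rt["PFB"] - rt["PTB"])
         + (rt["NTA"] - rt["NFA"]) + (rt["NTB"] - rt["NFB"])) / 4
    # ASSUMPTION: base time uses a divisor of 2, which recovers the base
    # time exactly under the additive model used for the example below.
    t0 = (rt["PFA"] + rt["PTB"] - rt["NTB"] + rt["NFA"]) / 2
    return {"a": a, "b_plus_d": b_plus_d, "c": c, "t0": t0}


# Illustrative means built from t0 = 1500, a = 90, c = 150, b + d = 200 (msec).
means = {
    "PTA": 1500.0, "PTB": 1590.0,   # positive, true
    "PFA": 1650.0, "PFB": 1740.0,   # positive, false: + c
    "NTA": 1850.0, "NTB": 1940.0,   # negative, true:  + c + (b + d)
    "NFA": 1700.0, "NFB": 1790.0,   # negative, false: + (b + d)
}
estimates = clark_chase_parameters(means)
```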

Collins  and  Quillian  task 

Procedure. Each trial began with a left fixation point which served as a
cueing signal. After a fixed delay, the point was replaced by a four-
or five-word sentence. The subject judged whether the sentence was
generally true or false. The sentence remained visible until the subject made
a response. This cycle was repeated with an intertrial interval of
approximately two seconds. Reaction time was measured as the interval between
stimulus onset and response initiation. Below are events in a typical trial:

      Subject shown sentence      ►   Subject responds

S2    A canary is an animal       ►   "True"
P2    A canary has skin           ►   "True"
F1    A canary is a fish          ►   "False"




Stimuli and Design. During the four test blocks (two on each day), each
subject verified a total of 144 sentences; each block contained 36
sentences of which half were true and half were false. Each block contained
six exemplars of each of six sentence types. These sentence types stated
either property (P) relations (e.g., "Roses are red") or superset (S)
relations (e.g., "Roses are flowers"). Each P or S sentence had a number
added to it; this indicated the number of levels necessary to "move through"
to decide whether the sentence was true. For example, "A canary can sing"
was a P0 sentence; "A canary can fly" was a P1 sentence (since it was
presumed that flying is a property of birds); and "A canary has skin" was a
P2 sentence (since it was presumed that "has skin" is a property of animals).
Similarly, "A canary is a canary" was an S0 sentence; "A canary is a bird"
was an S1 sentence; and "A canary is an animal" was an S2 sentence.

The sentences were generated from 12 separate "information structures".
Each of these structures consisted of a three-level hierarchy with the words
ordered by a subset relationship and properties attributable to particular
subset levels. For example, the sentences from one structure were as
follows:

S0:  A trout is a trout.

S1:  A cod is a fish.

S2:  A salmon is an animal.

P0:  A shark is dangerous.

P1:  A flounder can swim.

P2:  A barracuda breathes.


These  same  12  structures  were  also  used  to  generate  negative  sentences 
by  rearranging  property  and  superset  information  among  structures.  Each 
block  of  36  trials  used  the  positive  and  negative  sentences  derived  from 
three  of  the  structures. 

Variables. The principal data were the mean correct verification times
for each of the six positive sentence types and a mean correct verification
time for negative sentences across levels. The best-fitting linear function
was calculated to fit the relationship between levels of sentences and mean
RTs for the positive S and P sentences. Both the percentage of errors and
mean error time across conditions were computed.


Shepard  and  Teghtsoonian  task 


Procedure.  In  this  self-paced  (within  limits)  task,  subjects  decided 
whether  or  not  they  remembered  having  seen  each  number  earlier  in  the 
series.  Subjects  were  allowed  to  proceed  through  the  list  at  their  own 
speed  (maximum  of  10  seconds),  but  were  not  permitted  to  take  notes.  No 
feedback  as  to  correctness  of  responses  was  provided.  Below  are  events  in 
a typical  trial: 


Subject shown number   ►   Subject responds


Stimuli and Design. Subjects were exposed to one list per day; the lists
were each 101 items in length. The three-digit numbers included in each
list were randomly selected from the total population from 100 through 999
(barring triples). With a single exception, every number in a given list
was presented exactly twice. The second occurrences of the numbers were
placed so that several lags between presentation and test were represented.
The lags used were 1, 2, 4, 8, 12, 16, 20, 24, 30, and 36 items, with five
exemplars of each lag in a given list. The probability of a "new" item
was slightly higher than .5 in the first part of the list, and the
probability of an "old" item was slightly higher than .5 near the end of a
list.

Variables. Several measures were derived from this task. These
measures fell into two categories. First, there were "traditional" retention
parameters, and second, there were parameters derived from signal detection
theory. Of the traditional measures, the two employed here were proportion
correct (i.e., X/101) and a two-parameter estimate of the best-fitting curve
for the probability correct by lag function. This function was
characterized by the least squares estimates for A and B in the equation:

$$y = Ax^{B}$$, where A = intercept and B = exponent.

These parameters were calculated for each subject for each day.
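A common way to obtain least squares estimates for a power function is to fit a straight line to the logarithms of both variables. The sketch below assumes that approach (the report does not say whether the fit was done on raw or log-transformed values); the function name and the example proportions are hypothetical, generated from known A and B so the fit can be checked.

```python
import math


def fit_power(lags, p_correct):
    """Fit y = A * x**B by least squares on log-transformed data.

    ASSUMPTION: the fit is linear in log(x) and log(y); all lags and
    proportions must be strictly positive. Returns (A, B).
    """
    xs = [math.log(x) for x in lags]
    ys = [math.log(y) for y in p_correct]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                      # exponent B
    a = math.exp(mean_y - b * mean_x)  # coefficient A (value of y at lag 1)
    return a, b


# The ten lags used in the task, with illustrative proportions correct
# generated from A = 0.9 and B = -0.2.
lags = [1, 2, 4, 8, 12, 16, 20, 24, 30, 36]
proportions = [0.9 * lag ** -0.2 for lag in lags]
A, B = fit_power(lags, proportions)
```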

The  signal  detection  parameters  were  derived  from  two  observed  scores, 
namely: 

(a)  probability that the subject responded "old" when the
stimulus was old, P("O" | O), or "hits"

(b)  probability that the subject responded "old" when the
stimulus was new, P("O" | N), or "false alarms".

The signal-detection discrimination parameter d' was calculated as
the normal deviate of (a) plus the normal deviate of (b). Beta was
calculated as the normal ordinate of (1 - a) divided by the normal ordinate
of (b).
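The two signal detection parameters can be sketched with the standard normal distribution from Python's standard library. The code uses signed z-scores, so d' appears as a difference, z(hits) minus z(false alarms); this equals the sum of the two unsigned deviates described in the text whenever the hit rate is above .5 and the false alarm rate is below .5. The function names and example rates are hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def d_prime(hit_rate, false_alarm_rate):
    """Discrimination index: z(hit rate) - z(false alarm rate).

    Both rates must lie strictly between 0 and 1.
    """
    return _STD_NORMAL.inv_cdf(hit_rate) - _STD_NORMAL.inv_cdf(false_alarm_rate)


def beta(hit_rate, false_alarm_rate):
    """Response bias: normal ordinate at z(hits) over ordinate at z(false alarms).

    By symmetry of the normal curve, the ordinate for (1 - hit rate) equals
    the ordinate for the hit rate.
    """
    z_hit = _STD_NORMAL.inv_cdf(hit_rate)
    z_fa = _STD_NORMAL.inv_cdf(false_alarm_rate)
    return _STD_NORMAL.pdf(z_hit) / _STD_NORMAL.pdf(z_fa)


# Illustrative rates only; these are not data from the report.
example_d = d_prime(0.8, 0.2)
example_beta = beta(0.8, 0.2)
```

Rates of exactly 0 or 1 have no finite normal deviate, so any such scores would need an adjustment before these functions are applied; the report does not say how such cases were handled.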




RESULTS  AND  DISCUSSION 




The  results  of  three  sets  of  analyses  are  presented  below.  The  first 
set  deals  with  the  replicability  of  previous  experimental  work  with  highly 
similar  paradigms.  The  principal  issue  is  whether  or  not  the  major  find- 
ings were  supported  in  terms  of  group  main  effects,  despite  changes  in  the 
specific implementation of each task. The second set of analyses is
concerned with the information-processing parameters obtained from the tasks.
Various  aspects  of  the  measures  are  examined,  including  their  reliability, 
practice  effects,  descriptive  statistics,  and  the  character  of  the  response 
distributions  in  the  subject  population.  The  final  set  of  analyses  deals 
with  validity-type  issues.  The  inter-  and  intra-task  correlation  matrices 
are  presented  and  discussed  in  terms  of  the  constructs  represented. 

Replications  of  Group  Effects 

Posner  task.  Figure  1 presents  the  results  for  the  Posner  task  in  the 
form  of  a tree  diagram.  These  results  indicate  that  the  pattern  of  the 
Posner  and  Mitchell  (1967)  data  was  replicated  in  the  current  experiment. 

The  data,  particularly  for  Day  2 performance,  are  quite  similar  to  the 
Posner  and  Mitchell  results  in  spite  of  differences  in  the  size  and  appear- 
ance of  the  letter  displays  used  in  the  two  experiments.  The  discrepancy 
in  mean  reaction  time  (RT)  to  respond  "different"  may  be  attributable  to 
differences  in  the  approach  used  to  calculate  the  measure.  In  the  Posner 
and  Mitchell  study  "different"  RT  was  based  upon  judgments  from  the  "cate- 
gory match"  condition  only,  whereas  in  the  current  work  the  measure  is  based 
upon  pooled  judgments  across  the  three  conditions. 

In  general,  the  data  from  both  experiments  show  a monotonically  increas- 
ing RT  as  a function  of  depth  of  abstraction.  With  the  exception  of  one  sub- 
ject, all  of  the  node-to-node  processing  times  are  monotonically  increasing 
in  the  current  study.  Day  2 performance  is  consistently  faster  than  Day  1 
performance  at  all  levels  of  abstraction  by  about  60  msec. 


Figure 1. Tree diagram for letter classification task

* Tree diagrams for Day 1 and Day 2 data. Numbers refer to mean RTs (msec) for all subjects.
Numbers in parentheses show response times obtained by Posner and Mitchell (1967).


Meyer task. Table 2 summarizes the mean RTs obtained for the Meyer
task. As in Meyer et al. (1974), words are usually classified more
quickly than nonwords. This is the case whether the item is the first or second
letter string of a stimulus pair and is characteristic of both Day 1 and
Day 2 performance.

Meyer et al. found that graphemic similarity together with phonemic
similarity facilitated recognition (i.e., the difference between Type 4
and Type 1 stimulus pairs is positive), but that graphemic similarity alone
inhibited performance (i.e., the difference between Type 4 and Type 3
stimulus pairs is negative). This pattern of results is replicated in Day 1
performance of the current experiment, the magnitude of differences being
even larger than those reported by Meyer et al. Day 2 performance also
shows a large facilitation effect when both phonemic and graphemic
similarity are present, but the inhibition effect for graphemic similarity alone
is essentially zero.

These  results  suggest  that,  on  Day  2,  subjects  may  have  adopted  a dif- 
ferent strategy  for  responding  to  the  second  letter  string  of  each  pair. 
According  to  this  strategy,  subjects  would  decide,  first,  if  the  second 
letter  string  was  both  graphemically  and  phonemically  similar  to  the  first 
string.  Second,  if  the  strings  were  not  similar,  the  subjects  would  deter- 
mine if  the  second  letter  string  was  a word  or  a nonword.  This  strategy 
would  account  for  the  Day  2 results  in  which  RTs  to  words  that  looked  and 
sounded alike were fastest, followed by the other word conditions
regardless of graphemic or phonemic similarity, and lastly all of the nonword
conditions.

In  the  current  experiment,  the  Meyer  paradigm  was  modified  to  include 
a category  of  phonemically  similar  but  graphemically  dissimilar  word  pairs. 
It  was  expected  that  the  presence  of  phonemic  similarity  alone  would  faci- 
litate recognition.  The  results  appeared  to  be  counterintuitive;  as  can 
be  seen  in  Table  2,  phonemic  similarity  alone  inhibited  performance  (i.e., 
the difference between Type 4 and Type 2 stimulus pairs was positive) in
both testing sessions, the magnitude of the difference being about the same
on both days.

Table 2

Mean RTs (msec), Second Letter String

Type of                    Phonemic       Graphemic      Meyer et al.
Stimulus Pair              Relation       Relation       (1974)

(1)  Word-Word             Similar        Similar
(2)  Word-Word***          Similar        Dissimilar
(3)  Word-Word             Dissimilar     Similar
(4)  Word-Word             Dissimilar     Dissimilar     602**
(5)  Word-Nonword          Similar*       Similar
(6)  Word-Nonword          Dissimilar     Dissimilar
(7)  Nonword-Word          Similar*       Similar
(8)  Nonword-Word          Dissimilar     Dissimilar
(9)  Nonword-Nonword       Similar*       Similar
(10) Nonword-Nonword       Dissimilar     Dissimilar

Mean RT (msec) First Word       736
Mean RT (msec) First Nonword    916

*   Nonwords may have been either phonemically similar or dissimilar to their mates.

**  Mean of Meyer's Type 2 and Type 4.

*** This stimulus-type absent in Meyer's study.




Baron task. The modifications of the Baron (1973; Baron & McKillop,
1975) procedure adopted for the present study preclude a direct comparison
of obtained results. Specifically, the Baron (1973) procedure involved
presenting all three phrase types (Sense, Nonsense, and Homophone) in all
conditions; his conditions differed as a function of instruction (i.e.,
the first condition required subjects to judge sense or nonsense on the
basis of appearance, and the second condition required subjects to make
their judgments on the basis of an acoustic representation). In addition,
subjects were administered several hundred trials per condition. The Baron
& McKillop (1975) procedure used the three conditions employed in our study;
however, in their study stimuli were presented in a list, subjects were
required to mark each phrase, and the dependent measure was time to
complete each list. Nevertheless, certain aspects of the data are interesting
to compare with the results of our study (see Table 3).

Baron (1973) tested two primary hypotheses that have parallels in the
present experiment. First, he argued that there would be no difference
between H and N phrases in the "appearance" condition. In our terminology,
this comparison can be translated into an examination of N phrases in the
SN condition, and the H phrases in the SH condition. As can be seen in the
table, we likewise found no differences (on either day of testing). His
second major hypothesis was that the H phrases would take substantially
longer than the S phrases in the "sound" condition. This translates into
a comparison between the H phrases in the HN condition and the S phrases
in the SH condition. Again, our results confirmed Baron's hypothesis.

Data from Baron & McKillop (1975) are also shown in Table 3.
Unfortunately, they reported data only for the five lowest and five highest
SH/HN ratio subjects. The times-per-item shown in Table 3 were calculated
from their figures. These data are not directly comparable to those in
the current study for several reasons (e.g., differences in response made,
sampling  of  "extreme"  groups  of  subjects,  etc.);  however,  it  is  interesting 
to  note  that  the  HN  times  are  substantially  greater  than  the  SN  and  SH  times 
in  both  studies.  Also,  it  seems  that  with  practice,  our  subjects  make  the 
SN  decision  (which  is  ambiguous  in  the  sense  that  it  could  be  made  on  the 
basis  of  the  appearance  or  the  sound  of  the  phrases)  at  the  same  rate  as 
the  SH  decision  (which  requires  a "visual"  judgment).  This  confirms  Baron's 
hypothesis  that  the  visual  strategy  is  faster  and  more  efficient  than  the 
phonemic  strategy  and  probably  uses  information  that  is  available  "earlier" 
in  phrase  processing. 

Sternberg  task.  The  linear  functions  relating  RT  to  memory  set  size 
for  the  Sternberg  task  are  shown  in  Figure  2.  The  data  for  Day  1 and  Day  2 
performance  are  plotted  separately.  In  both  cases,  RT  increases  linearly 
with  set  size,  at  the  same  rate  for  both  positive  and  negative  responses. 

The  rate  of  increase  is  about  62  msec  for  each  item  in  the  positive  set  on 
Day  1;  the  intercept  is  about  490  msec.  The  function  for  Day  2 shows  a 
general  improvement  in  performance,  with  the  slope  about  48  msec  per  item 
and  the  intercept  about  440  msec.  These  results  support  a serial  exhaust- 
ive model  of  memory  scanning  and  compare  favorably  with  results  obtained 
by  Sternberg  (1975).  In  his  experiments,  the  slope  of  the  RT  function  was 
about  38  msec  per  item  and  the  intercept  was  about  400  msec.  The  subjects 
in  Sternberg's  study  were  tested  over  a large  number  of  trials  so  that  the 
flatter  slope  and  lower  intercept  values  obtained  in  his  study  may  be  at- 
tributed, at  least  in  part,  to  more  practice. 

Juola  task.  Figure  3 presents  the  linear  functions  relating  RT  to 
memory  set  size  for  word  and  category  conditions.  Separate  functions  were 
calculated  for  positive  and  negative  responses  and  for  Day  1 and  Day  2 
sessions.  In  both  conditions,  RT  increases  linearly  with  the  number  of 
memory  set  items;  this  is  true  for  both  positive  and  negative  responses. 

The  intercepts  for  word  and  category  conditions  are  similar,  but  the  slopes 
for categories are considerably steeper than those for words. The slope
and intercept parameters for the current study and those obtained by Juola
and  Atkinson  (1971)  are  shown  in  Table  4.  The  pattern  of  their  results  was 
replicated  in  the  current  experiment;  the  parameter  estimates  for  both  test- 
ing sessions  are  quite  similar  to  theirs  despite  modifications  in  the  con- 
struction of  stimulus  items  in  the  current  study. 

An analysis of the slope parameters provides some insight into the
nature of the search processes in the two conditions. The slopes for
positive and negative responses in the word condition are nearly equal,
indicating that the scanning process for words was exhaustive: all possible
pairwise comparisons were made before deciding whether or not a match had
occurred. In the category condition, however, the slope for positive
responses is considerably flatter than that for negative responses. This
would suggest a scanning process for categories that self-terminated when
an item in the memory set was found to match the target. Although
performance in both conditions improved from one day to the next, these
differences in scanning were apparent in both sets of results.
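The slope argument can be made concrete with the ratio of positive to
negative slopes. Under an exhaustive search the two slopes should be about
equal (ratio near 1.0); under a self-terminating search a positive trial
inspects, on average, about half of the memory set, so the ratio should
approach 0.5. The sketch below applies this to the slopes reported in
Table 4. The cutoff of 0.75 is an arbitrary illustrative midpoint, not a
criterion used in the study, and the function names are assumptions.

```python
# Slopes (msec per memory-set item) as reported in Table 4.
SLOPES = {
    ("word", "Day 1"): {"positive": 57, "negative": 46},
    ("word", "Day 2"): {"positive": 56, "negative": 51},
    ("category", "Day 1"): {"positive": 130, "negative": 209},
    ("category", "Day 2"): {"positive": 92, "negative": 147},
}


def slope_ratio(positive, negative):
    """Ratio of the positive-response slope to the negative-response slope."""
    return positive / negative


def classify(ratio, cutoff=0.75):
    """Label a ratio; `cutoff` is a hypothetical midpoint between 0.5 and 1.0."""
    return "exhaustive-like" if ratio >= cutoff else "self-terminating-like"


def summarize():
    """Map each (condition, session) to its slope ratio and label."""
    out = {}
    for key, s in SLOPES.items():
        r = slope_ratio(s["positive"], s["negative"])
        out[key] = (round(r, 2), classify(r))
    return out


if __name__ == "__main__":
    for key, value in summarize().items():
        print(key, value)
```

On these figures the word ratios fall above 1.0 and the category ratios
fall near 0.6 on both days, which is the pattern described in the text.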

Another aspect of the memory search task is the comparison operation
itself. An analysis of the slope parameters also provides some insight
into this operation in the two conditions. One alternative was that in the
category condition, subjects would convert the target word to its
appropriate category name and then compare this name with all of the items
in the memory set. For example, if the memory set was "COLOR CLOTHING
BIRD" and the target word "ROBIN", the subject would convert the target
word to a category ("a robin is a bird") and then search the memory set for
"BIRD". If the search process operated in this manner, the slope of the RT
function for categories would be similar to that for words but with a
higher intercept. On the other hand, subjects might decide, for each member
of the memory set, whether the target word was a member of that category
(e.g., Is a robin a color? Is a robin a clothing? Is a robin a bird?). In
this case, the slope for categories would be greater than that for words
but the two functions would have similar intercepts. The results in
Figure 3 support the latter of these alternatives; the slopes of the
category functions for both testing sessions are, on the average, two to
three times greater than the slopes of the word functions.


Table 4

Slopes and Intercepts (msec) of the Best-Fitting Linear Functions
Relating Mean RTs to Memory Set Size in Juola Tasks

                                 Present Study        Juola (1971)
Measure                          Day 1     Day 2      Data

Word task
  Slopes
    Positive trials                57        56         49
    Negative trials                46        51         26
    Mean                           52        54         [illegible]
  Intercepts
    Positive trials               481       446        543
    Negative trials               545       485        617
    Mean                          513       [illegible] [illegible]

Categorization task
  Slopes
    Positive trials               130        92         89
    Negative trials               209       147        111
    Mean                          [illegible] [illegible] [illegible]
  Intercepts
    Positive trials               605       634        670
    Negative trials               578       586        653
    Mean                          157       [illegible] 662




Clark task. Table 5 presents the mean RTs by sentence type for Day 1
and Day 2 and the observed latencies obtained by Clark and Chase (1972,
Experiment 1). Day 1 and Day 2 performance is shown in Figure 4 as a
function of whether the sentence was affirmative or negative, whether it
contained "above" (a) or "below" (b), and whether it was true or false with
respect to the picture. These results compare favorably with those of
Clark and Chase in terms of both overall RTs and the pattern of the data.

Parameters for predicting verification RTs for each testing session
were calculated according to the method described previously. The
estimates for Day 1 performance were: "below" time 136 msec, "negation"
time 829 msec, "comparison" time 200 msec, and "base" time 1735 msec. The
estimates for Day 2 were 110 msec, 685 msec, 146 msec, and 1489 msec,
respectively. The first three parameters were found to account for more
than 98 percent of the variance among the eight means for each session.
These results replicated those of Clark and Chase, who obtained parameter
estimates of 93 msec, 685 msec, 187 msec, and 1763 msec, respectively,
accounting for more than 99 percent of the variance among means.

In order to further test the validity of the four-parameter model,
subjects were divided into odd- and even-numbered groups. The four
parameters based on the observed RTs of one group were used to predict RTs
for the eight sentence types of the other group. The correlations between
observed and predicted latencies for the two groups were both greater than
r = .97, indicating that the model was quite powerful in accounting for
group performance.
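The additive structure of the four-parameter model can be illustrated by
generating predicted latencies for the eight sentence types from the Day 1
estimates and correlating them with the Day 1 means in Table 5. This is a
sketch under stated assumptions, not the estimation procedure used in the
study: the 0/1 coding of sentence types follows the latency components
listed in Table 5, with the components b and d (which always occur
together) treated as the single "negation" parameter, and all function
and variable names are illustrative.

```python
import math

# (label, below, negation, comparison) coding for the eight sentence types.
# Assumed mapping: a = "below", b + d = "negation", c = "comparison".
SENTENCE_TYPES = [
    ("true affirmative, above", 0, 0, 0),
    ("true affirmative, below", 1, 0, 0),
    ("false affirmative, above", 0, 0, 1),
    ("false affirmative, below", 1, 0, 1),
    ("true negative, above", 0, 1, 1),
    ("true negative, below", 1, 1, 1),
    ("false negative, above", 0, 1, 0),
    ("false negative, below", 1, 1, 0),
]

# Day 1 parameter estimates (msec) quoted in the text.
DAY1_PARAMS = {"base": 1735, "below": 136, "negation": 829, "comparison": 200}

# Day 1 observed mean RTs (msec) from Table 5, in the same row order.
DAY1_OBSERVED = [1689, 1787, 1896, 2143, 2778, 2755, 2538, 2759]


def predict(params):
    """Predicted RT for each sentence type as a sum of its components."""
    return [
        params["base"]
        + below * params["below"]
        + negation * params["negation"]
        + comparison * params["comparison"]
        for _, below, negation, comparison in SENTENCE_TYPES
    ]


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    predicted = predict(DAY1_PARAMS)
    for (label, *_), p, o in zip(SENTENCE_TYPES, predicted, DAY1_OBSERVED):
        print(f"{label:28s} predicted {p:5d}  observed {o:5d}")
    print("r =", round(pearson_r(predicted, DAY1_OBSERVED), 3))
```

The same `predict` function could be applied to parameters estimated from
one half of the subjects and compared with the other half's means, which
is the form of the odd-even check described above.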

Table 5

Breakdown of Latencies for Eight Types of Sentences from Clark and Chase Task

                                                         Observed RT (msec)
                                                                    Clark and
                                                                    Chase (1972)
                                           Latency    Present Study observed
Sentence Type           Sentence           Components Day 1  Day 2  data

True affirmative  above A is above B       t0          1689   1432   1744
                  below B is below A       t0+a        1787   1556   1875
False affirmative above B is above A       t0+c        1896   1597   1959
                  below A is below B       t0+a+c      2143   1754   2035
True negative     above B isn't above A    t0+b+c+d    2778   2312   2624
                  below A isn't below B    t0+a+b+c+d  2755   2321   2739
False negative    above A isn't above B    t0+b+d      2538   2146   2470
                  below B isn't below A    t0+a+b+d    2759   2297   2620


Collins and Quillian task. The functions relating RT to sentence level
for Day 1 and Day 2 are shown in Figure 5. Although our subjects were
generally faster at confirming sentences than those used by Collins and
Quillian (1969), the patterns of results in the two studies were quite
similar. RT increased linearly with the number of levels separating memory
nodes when retrieving either superset or property information. For Day 1
performance, the slope of the two functions is about 66 msec, which
represents an estimate of the time needed to retrieve superset information
from LTM. For Day 2, this figure is about 48 msec. If the RT functions for
property and superset are assumed to be parallel, the difference in
intercepts serves as an estimate of the time to retrieve property
information; that is, about 125 msec on Day 1 and 80 msec on Day 2. Our
obtained slope and intercept-difference estimates are generally smaller
than those obtained by Collins and Quillian, who reported a slope of 75
msec and a difference in intercept of 225 msec.

Figure 5. Mean RTs (msec) and best-fit regression lines for Collins and
Quillian task. [Plot not reproduced. Legend: Properties, Supersets.
Abscissa: Levels of True Sentences; False Sentences.]

Despite  alternative  explanations  that  could  account  for  these  data, 
the  present  study  replicated  the  finding  that,  for  these  stimuli,  property 
and  superset  functions  are  parallel,  with  different  intercepts.  Whether 
semantic  memory  effects  are  explained  in  terms  of  a hierarchical  structure 
or  some  other  model,  the  process  of  moving  from  one  memory  node  to  another 
was  found  to  take  a predictable  amount  of  time.  For  present  purposes, 
this  RT  is  being  used  as  a measure  of  scanning  through  LTM. 
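The parallel-functions assumption amounts to a two-line model in which
property sentences add a constant to the superset function at every level.
A minimal sketch follows. The slopes and intercept differences are the
values quoted in the text; the base intercept of 1000 msec is purely a
placeholder introduced here so that the functions can be evaluated, and the
names are illustrative.

```python
# Slope (msec per level) and property-minus-superset intercept difference
# (msec), as quoted in the text.
PARAMS = {
    "Day 1": {"slope": 66.0, "property_offset": 125.0},
    "Day 2": {"slope": 48.0, "property_offset": 80.0},
    "Collins and Quillian": {"slope": 75.0, "property_offset": 225.0},
}

# Hypothetical superset intercept; not a value reported in the text.
BASE_INTERCEPT = 1000.0


def predicted_rt(level, slope, property_offset, kind,
                 base_intercept=BASE_INTERCEPT):
    """RT for a sentence `level` nodes apart; `kind` is 'superset' or 'property'."""
    if kind not in ("superset", "property"):
        raise ValueError("kind must be 'superset' or 'property'")
    offset = property_offset if kind == "property" else 0.0
    return base_intercept + offset + slope * level


if __name__ == "__main__":
    for label, p in PARAMS.items():
        for kind in ("superset", "property"):
            rts = [predicted_rt(lv, p["slope"], p["property_offset"], kind)
                   for lv in (0, 1, 2)]
            print(label, kind, rts)
```

Whatever base intercept is chosen, the vertical distance between the two
functions stays equal to the property offset, which is what makes the
intercept difference interpretable as a retrieval time.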

Shepard and Teghtsoonian task. Figure 6 summarizes the effect of delay
on the accuracy of classifying an old three-digit number as "old". Separate
functions are presented for Day 1 and Day 2 and for the two sessions
combined. Although the curves are very rough, they represent the course of
forgetting as delays get longer. The likelihood of recognizing an old
number as "old" is almost perfect when the number was just seen. The
probability drops to about .8 with delays of 8 intervening items, then to
about .7 with delays of 20 to 36 items. Even with the maximum delay of 36
items, the probability of correctly recognizing an old number was well
above chance level. These results (Table 6) parallel those of Shepard and
Teghtsoonian (1961) despite changes in procedure in the current experiment.
They reported "hit" probabilities (of recognizing an old number as "old")
of about .77 with delays of 6 or 7 items and about .65 for delays of about
35 items. The probability of a "false alarm" (i.e., classifying a new
number as "old") in their experiment averaged about .23. This figure
compares favorably with "false alarm" rates of .28 and .31 for the two
sessions in the current experiment.

Figure 6. Lag function for Shepard and Teghtsoonian task.


Table 6

Selected Shepard and Teghtsoonian Task Parameters

                                    Present study       Shepard et al.
Variables                           Day 1     Day 2     (1961)

Hits (p("O" given O))                .73       .77       .77*
False Alarms (p("O" given N))        .28       .31       .23
d' (in Z scores)                    1.284     1.335      --
Beta (Z score ratio)                1.084     0.929      --

* Based on lags of six and seven items.

Individual  Measures 

Overview. The above results indicate that, for the most part, the
major group effects were replicated in each paradigm. Thus, there is
demonstrated empirical and theoretical support for the information
processing constructs contained in the tasks. However, the value of the
paradigms for an assessment battery depends primarily on the measures
derived from them and the properties of these measures when considered as
potential individual difference variables. This distinction between task
effects and measurement properties is particularly important in the
present context since most of the paradigms were not originally generated
for the study of individual differences; the scientists were primarily
concerned with uncovering different aspects of the human information
processing system. Similarly, these paradigms have not previously been
considered as tests per se; no thought has been given to typical test
development issues. Finally, the distinction between group effects and
individual measures is critical in that several theoretically independent
measures can be obtained from each task. For example, the Shepard and
Teghtsoonian task results can be described by a number of different
parameters: the "standard" measure of proportion of correct items (or,
more finely, proportion of "hits" and "false alarms"), the two parameters
of the exponential equation that is the best fit to the
probability-correct-by-lag function, and the signal-detection-theory
parameters d' and β.
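For readers unfamiliar with the signal-detection parameters, the sketch
below shows the conventional equal-variance Gaussian computation of d' and
β from a hit rate and a false-alarm rate, applied to the Day 1 group rates
reported for this task (hits .73, false alarms .28). It is an illustration
of the standard formulas, not necessarily the exact procedure used in the
study; the function names are assumptions. Values computed from group
rates need not equal the tabled d' and β, which are presumably averages of
individual subjects' values.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def d_prime(hit_rate, false_alarm_rate):
    """Sensitivity: z(hit rate) minus z(false-alarm rate)."""
    return _STD_NORMAL.inv_cdf(hit_rate) - _STD_NORMAL.inv_cdf(false_alarm_rate)


def beta(hit_rate, false_alarm_rate):
    """Response bias: ratio of normal ordinates at z(hits) and z(false alarms)."""
    z_hit = _STD_NORMAL.inv_cdf(hit_rate)
    z_fa = _STD_NORMAL.inv_cdf(false_alarm_rate)
    return _STD_NORMAL.pdf(z_hit) / _STD_NORMAL.pdf(z_fa)


if __name__ == "__main__":
    # Day 1 group rates reported for the Shepard and Teghtsoonian task.
    print("d' =", round(d_prime(0.73, 0.28), 3))
    print("beta =", round(beta(0.73, 0.28), 3))
```

Both rates must lie strictly between 0 and 1; a subject with a hit rate of
exactly 1.0 would need a correction before these formulas could be applied.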

Given these considerations, a set of 40 variables was selected for
detailed examination. These variables are shown in Table 7. Also shown
in this table are the theoretical operations that these variables are
hypothesized to measure. The operations were derived primarily from
Table 1 above, which described the operations involved in each task
condition. As can be seen in Table 7, there are several "redundancies" in
the operations measured across the set of variables; many operations are
sampled more than once. Also, most variables measure more than one
operation. These observations will be considered more fully below, when
construct validity is discussed. Prior to that discussion, further data
will be presented regarding the measurement properties of these variables,
namely their reliabilities, practice effects, and descriptive statistics.


Table 7

Operations for Each Task Measure

Measure                            Operations

POSNER PHYSICAL                    • Encode letters
                                   • Compare representations of letter
                                     patterns
                                   • Select and execute response

POSNER NAME                        • Encode letters
                                   • Retrieve name from LTM
                                   • Compare representations of letter names
                                   • Select and execute response

POSNER CATEGORY                    • Encode letters
                                   • Retrieve name from LTM
                                   • Retrieve category from LTM
                                   • Compare representations of letter
                                     categories

POSNER NAME MINUS PHYSICAL         • Retrieve name from LTM

POSNER RULE MINUS NAME             • Retrieve category from LTM

MEYER WORD                         • Encode letter string
                                   • Construct phonemic or graphemic
                                     representation
                                   • Search LTM for "word"
                                   • Select and execute response

MEYER NONWORD                      • Encode letter string
                                   • Construct phonemic or graphemic
                                     representation
                                   • Search LTM for "word"
                                   • Search LTM for "word" rules
                                   • Select and execute response

BARON SH                           • Encode semantic phrases
                                   • Construct visual representation
                                   • Search LTM for "phrase" rules
                                   • Select and execute response

BARON HN                           • Encode semantic phrases
                                   • Construct acoustic representation
                                   • Search LTM for "phrase" rules
                                   • Select and execute response

BARON SN                           • Encode semantic phrases
                                   • Construct visual or acoustic
                                     representation
                                   • Search LTM for "phrase" rules
                                   • Select and execute response

STERNBERG INTERCEPT POSITIVE,      • Encode target number
NEGATIVE RESPONSES                 • Select and execute response

STERNBERG SLOPE POSITIVE,          • Compare numbers (serial)
NEGATIVE RESPONSES

JUOLA WORD INTERCEPT POSITIVE,     • Encode target word
NEGATIVE RESPONSES                 • Select and execute response

JUOLA WORD SLOPE POSITIVE,         • Compare words (serial)
NEGATIVE RESPONSES

JUOLA CATEGORY INTERCEPT           • Encode target word
POSITIVE, NEGATIVE RESPONSES       • Select and execute response

JUOLA CATEGORY SLOPE POSITIVE,     • Retrieve category from LTM
NEGATIVE RESPONSES                 • Compare categories (serial)

CLARK AND CHASE "BASE" TIME        • Encode sentence
                                   • Encode picture
                                   • Construct kernel representations
                                   • Select and execute response

CLARK AND CHASE "BELOW" TIME       • Transform "below" representation

CLARK AND CHASE "NEGATION" TIME    • Transform "negation" representation

CLARK AND CHASE "COMPARISONS"      • Transform truth indices
TIME

COLLINS AND QUILLIAN SUPERSET      • Encode sentence
INTERCEPT                          • Construct kernel representation
                                   • Select and execute response

COLLINS AND QUILLIAN SUPERSET      • Retrieve superset information from LTM
SLOPE

COLLINS AND QUILLIAN PROPERTY      • Encode sentence
INTERCEPT                          • Construct kernel representation
                                   • Retrieve property information from LTM

COLLINS AND QUILLIAN PROPERTY      • Retrieve superset information from LTM
SLOPE

SHEPARD AND TEGHTSOONIAN LAG       • Encode numbers
FUNCTION EXPONENT, INTERCEPT       • Store items in LTM

SHEPARD AND TEGHTSOONIAN           • Retrieve numbers from LTM
p("hits"), PROPORTION CORRECT

SHEPARD AND TEGHTSOONIAN d'        • Judge strength of activation

SHEPARD AND TEGHTSOONIAN           • Select and execute response
p("false alarms"), β

NOTE: The following three measures are presented in the results section
but are not included here:

1. Posner "different", which is based on calculations across the three
   conditions;

2. Meyer "phonemic facilitation", which indicates an individual subject's
   propensity towards phonemic or graphemic encoding; and

3. Baron SH/HN, which indicates an individual subject's propensity towards
   acoustic or visual encoding.

Test-retest reliabilities. Table 8 summarizes the range of test-retest
reliabilities obtained for the set of 40 variables. It should be kept in
mind that test-retest correlations are not commonly used as a reliability
criterion. More typically, split-half (or odd-even) correlations are
reported. In the present case, these latter measures were not reported,
but they are substantially higher than test-retest correlations. Also,
these test-retest correlations are sensitive to "strategy" changes on the
part of the subject. For example, it was suggested above (when reporting
the group results) that the data indicate a probable shift in some
subjects' approach to some of the tasks from one day to the next. This
"strategy shift" rationalization of low test-retest correlations is
particularly appealing for the Juola tasks, where either of two different
strategies (i.e., categorize the target stimulus before comparison with
the memory set, or perform this categorization for each target item-memory
set pair) would enable subjects to perform the task. It is also appealing
for the Clark and Chase task, where some of the transformations might
become unnecessary for some subjects after a day of practice.
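The sensitivity of a test-retest correlation to strategy shifts can be
shown with a small numerical sketch. The scores below are invented for
illustration and are not data from the study. A uniform practice gain
leaves the rank order of subjects unchanged and the correlation at 1.0,
whereas a change of approach by a few subjects reorders them and lowers
the correlation.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical mean RTs (msec) for five subjects; not data from the study.
DAY1 = [600.0, 650.0, 700.0, 750.0, 800.0]

# Every subject improves by the same amount: ordering is preserved.
DAY2_UNIFORM = [rt - 50.0 for rt in DAY1]

# Some slower subjects adopt a more efficient strategy: ordering changes.
DAY2_SHIFT = [550.0, 600.0, 520.0, 700.0, 610.0]

if __name__ == "__main__":
    print("uniform gain:   r =", round(pearson_r(DAY1, DAY2_UNIFORM), 3))
    print("strategy shift: r =", round(pearson_r(DAY1, DAY2_SHIFT), 3))
```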

Table 8

Test-Retest Reliabilities

.70 ≤ r

  Posner category
  Posner "different"
  Baron Sense-Nonsense
  Baron Homophone-Sense
  Baron Homophone-Nonsense
  Clark and Chase "negation"
  Collins and Quillian intercept property

.50 ≤ r ≤ .69

  Posner physical
  Posner name
  Posner category minus name
  Meyer "word"
  Meyer "nonword"
  Sternberg slope positive
  Sternberg intercept positive
  Sternberg intercept negative
  Juola category intercept positive
  Clark and Chase "base"
  Collins and Quillian intercept superset
  Shepard proportion correct
  Shepard p("hits")
  Shepard p("false alarms")
  Shepard d'

r ≤ .49

  Posner name minus physical
  Meyer encoding facilitation
  Baron SH/HN
  Sternberg slope negative
  Juola word slope positive
  Juola word intercept positive
  Juola word slope negative
  Juola word intercept negative
  Juola category slope positive
  Juola category slope negative
  Juola category intercept negative
  Clark and Chase "below"
  Clark and Chase "comparisons"
  Collins and Quillian slope superset
  Collins and Quillian slope property
  Shepard lag-exponent
  Shepard lag-intercept
  Shepard β


Practice. Table 9 summarizes the effect of practice on the set of 40
variables in terms of those showing significant (p ≤ .05) and
nonsignificant t values when mean RTs for Day 1 and Day 2 are compared.
For those variables showing significant practice effects, most have high
test-retest reliability. The exceptions to this rule are some of the Juola
measures. Again, it can be argued that since different strategies could be
used in this task, some subjects changed strategies from day to day; if
those subjects who used an efficient strategy maintained it while those
who were inefficient changed, a significant practice effect would be
obtained.


Table 9

Measures Showing Significant and Nonsignificant
Practice Effects (p ≤ .05)

Significant Effects

  Posner physical
  Posner name
  Posner category
  Posner "different"
  Posner name minus physical
  Meyer "word"
  Meyer "nonword"
  Baron Sense-Homophone
  Baron Homophone-Nonsense
  Sternberg slope positive
  Sternberg intercept negative
  Juola word intercept positive
  Juola word intercept negative
  Juola category slope positive
  Juola category slope negative
  Clark and Chase "negation"
  Clark and Chase "base"
  Shepard lag-intercept

Nonsignificant Effects

  Posner category minus name
  Meyer encoding facilitation
  Baron Sense-Nonsense
  Baron SH/HN
  Sternberg intercept positive
  Sternberg slope negative
  Juola word slope positive
  Juola word slope negative
  Juola category intercept positive
  Juola category intercept negative
  Clark and Chase "below"
  Clark and Chase "comparisons"
  Collins and Quillian slope superset
  Collins and Quillian intercept superset
  Collins and Quillian slope property
  Collins and Quillian intercept property
  Shepard proportion correct
  Shepard lag-exponent
  Shepard p("hits")
  Shepard p("false alarms")
  Shepard d'

Most of the variables with non-significant practice effects are
"derived" scores, rather than "baseline" measures. For example, the Baron
"Sense-Homophone" and "Homophone-Nonsense" baseline scores both showed
significant practice effects; however, the ratio of these two measures,
Baron "SH/HN", did not change.
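The contrast between baseline and derived scores can be illustrated with a
paired t statistic computed on invented data. In the sketch below, both
baseline RTs drop from Day 1 to Day 2 by roughly the same proportion, so
each baseline shows a large t while their ratio shows almost none. The
scores, names, and sample size are hypothetical and serve only to show the
arithmetic; for 5 degrees of freedom the two-tailed .05 critical value of
t is 2.571.

```python
import math
from statistics import mean, stdev


def paired_t(day1, day2):
    """Paired-samples t statistic for Day 1 minus Day 2 differences."""
    diffs = [a - b for a, b in zip(day1, day2)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))


# Hypothetical mean RTs (msec) for six subjects; not data from the study.
SH_DAY1 = [1200, 1300, 1100, 1400, 1250, 1350]
SH_DAY2 = [1100, 1210, 1000, 1290, 1160, 1240]
HN_DAY1 = [1500, 1600, 1450, 1700, 1550, 1650]
HN_DAY2 = [1380, 1480, 1330, 1560, 1445, 1510]

# Derived score: ratio of the two baseline measures for each subject.
RATIO_DAY1 = [sh / hn for sh, hn in zip(SH_DAY1, HN_DAY1)]
RATIO_DAY2 = [sh / hn for sh, hn in zip(SH_DAY2, HN_DAY2)]

CRITICAL_T_DF5 = 2.571  # two-tailed, alpha = .05, 5 degrees of freedom

if __name__ == "__main__":
    for label, d1, d2 in (
        ("SH baseline", SH_DAY1, SH_DAY2),
        ("HN baseline", HN_DAY1, HN_DAY2),
        ("SH/HN ratio", RATIO_DAY1, RATIO_DAY2),
    ):
        t = paired_t(d1, d2)
        print(label, "t =", round(t, 2), "significant:", abs(t) > CRITICAL_T_DF5)
```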

Descriptive statistics. Table 10 presents detailed descriptive
information for each of the 40 measures. This information includes the
number of subjects in the sample, the mean, median, standard deviation,
maximum, minimum, and test-retest correlation (reliability) for each
variable. Also shown are the obtained frequency polygons, that is, the
distribution of scores within the sample population. These data may prove
useful for two purposes: first, as comparison data for future paradigmatic
replications, and second, as the basis for an information-processing data
base. The establishment of such a data base would be invaluable for the
eventual design of assessment instruments; in addition, it might prove to
be of substantial impact for the design of tasks or the assessment of
personnel requirements in man-machine systems.

Construct  Validity 

Overview.  The  concepts  of  converging  operations  and  construct  valid- 
ity are  relatively  new  in  experimental  psychology.  While  these  concepts 
are  of  critical  importance  to  any  research  concerned  with  test  development, 
individual  differences,  and  performance  assessment,  there  are  no  formal  ex- 
perimental designs  or  statistical  procedures  extant  for  their  evaluation. 
This  is  particularly  true  in  the  context  of  information-processing  research, 
where  the  processes  and  operations  are  theorized  to  be  "real"  in  the  sense 
that  they  are  discrete  and  take  measurable  amounts  of  time.  Standard 
statistical  procedures  (e.g.,  factor  analysis)  are  of  marginal  utility  in 


54 


J&  .jk 


Table  10 

Descriptive  Measures  and  Frequency  Polygons 


N 

Mean 
Mi 

Standard  Deviation 
Maximum 
Minimum 
Reliability 


Day  1 Day  2 


585  I 547 


794  721 

443  444 


Day  1 

Day  2 

N 

50 

37 

Mean 

684 

629 

Median 

669 

625 

Standard  Deviation 

100 

71 

Maximum 

1056 

830 

Minimum 

485 

505 

Reliability 

.58 

Standard  Deviation 
Maximum 
Minimum 
Reliability 


Day  2 

37 
771 
773 
173  121 

1425  1064 

594  575 


Day  1 - Day  2 

POSNER  MEAN-"SAME"  PHYSICAL  MATCH 


St  V 
/ / \\ 

/!  \\ 

/ \ \ 


400  450  500  550  600  650  700 

Mean  RT  (MSEC) 


POSNER  MEAN-"SAME"  NAME  MATCH 


/ 'V, 


o L '-/S - ■ 

450  500  550  600  650  700  750  800  850  900  1050 
Mean  RT  (MSEC) 


POSNER  MEAN-"SAME"  CATEGORY  MATCH 


• \ 

r-T\ 


500  600  700  800  900  1000  1100  1200  1400 

Mean  RT  (MSEC) 


55 


Table  10  (continued) 


Day  1 Day  2 


MEYER "WORD 


Day  1 Day  2 


Mean 


736  647 


■§  25 
(/> 


715  i 634 


Median 


20 


Standard  Deviation  112 


1123  869 


Maximum 


Minimum 


582  522 


520  560  600  640  680  720  760  800  840  880  920  960 1 000 1 040 
Mean  RT  (MSEC) 


MEYER  "NONWORD' 


Mean  916  756 

Median  843  737 

Standard  Deviation  252  113 

Maximum  2146  1195 


Minimum 

Reliability 


2146  1195 
679  603 


I r*-\— x 

/ ' \ 

/ ' \ N' 

/ \ \ 

1 \ \ 

/ / \ x 

fj  \ \ 

Mean  RT  (MSEC) 


Day  1 Day  2 


MEYER  ENCODING 


r^r 

54 

50 

Mean 

975 

958 

Median 

978 

952 

Standard  Deviation 

78 

75 

Maximum 

1153 

1161 

Minimum 

775 

794 

1 

Reliability 

.42 

760  800  840  880  920  960  10001040108011201160 
Mean  RT  (MSEC) 


Table  10  (continued) 


BARON  SH/HN 


Median 


Standard  Deviation 


Maximum 


Minimum 


Reliability 


.575  .625  .675  .725  .775  .825  .875  .925  975  1 025 

Mean  SH/HN 


STERNBERG  SLOPE  (POSITIVE) 


50  60  7 0 80  90  1 00  110  120  130  H0  150  190 
MSEC/I  tem 


STERNBERG  INTERCEPT  (POSITIVE) 


Mean  RT  (MSEC) 


N 

54 

48 

Mean 

442 

425 

Median 

431 

413 

Standard  Deviation 

88 

78 

Maximum 

757 

783 

Minimum 

261 

312 

Reliability 

.52 

N 

54 

48 

40 

35 

Mean 

75 

49 

tn  30 

Median 

73 

48 

a 25 

Standard  Deviation 

32 

21 

CO 

o 20 

Maximum 

190 

107 

§ 15 

Minimum 

15 

10 

10 

5 

Reliability 

.60 

0 

10 

Table  10  (continued) 


Day  1 

Day  2 

N 

54 

48 

35 

Mean 

1205 

1193 

S 30 

Median 

1172 

1181 

•§  25 

CO 

Standard  Deviation 

246 

197 

o 20 

8 15 

Maximum 

2113 

1787 

1 ,0 

Minimum 

760 

8ju  I 

5 

Reliability 

.83 

0 

Day  1 

1 1 

Day  2 

1 

40  i 

N 54  48 

Mean  1289  1187 

Median  1250  1165 

Standard  Deviation  300  241 

Maximum  2590  1979 


Minimum 

Reliability 


2590  1979 
851  852 


Day  1 Day  2 


Day  1 Day  2 

BARON  SENSE-NONSENSE 


700  800  900  1000  1100  1200  1300  1400  1500  1600  1700  1800  2100 

Mean  RT  (MSEC) 


BARON  HOMOPHONE-SENSE 


800  900  1000  1100  1200  1300  14001500  1690  1700  1800  1900  2000  2500 

Mean  RT  (MSEC) 


BARON  HOMOPHONE  NONSENSE 


— 

40  r- 

N 

54 

48 

35 

Mean 

1579 

1423 

S 30 

Median 

1558 

1450 

% 25 

Standard  Deviation 

306 

235 

o 20 

8 15 

Maximum 

2616 

2005 

cD 

“•  10 

Minimum 

1093 

1027 

5 

Reliability 

.47 

0 t 

10 

Mean  RT  (MSEC) 


V 


Table  10  (continued) 


STERNBERG  SLOPE  (NEGATIVE) 


STERNBERG  INTERCEPT  (NEGATIVE) 


Median 


Standard  Deviation 


Maximum 


Minimum 


Reliability 


Mean  RT  (MSEC! 


N 

54 

48 

Mean 

48 

47 

Median 

48 

46 

Standard  Deviation 

00 

CN 

15 

Maximum 

121 

88 

Minimum 

-17 

12 

Reliability 

.45 

1 

54 

48 

536 

464 

523 

449 

98 

59 

847 

654 

380 

363 

1 i5!  . 

JUOLA  WORDS  SLOPE  (POSITIVE) 


N 

IT 

46 

Mean 

56 

52 

Median 

54 

52 

Standard  Deviation 

32 

24 

Maximum 

127 

176 

Minimum 

-8 

-12 

Reliability 

.19 

J 

Table  10  (continued) 


JUOLA  WORD  INTERCEPT  (POSITIVE) 


Mean  RT  (MSEC) 


JUOLA  WORD  SLOPE  (NEGATIVE) 


Median 


Standard  Deviation 


Maximum 


Minimum 


Reliability 


MSEC/ltem 


JUOLA  WORD  INTERCEPT  (NEGATIVE) 


300  350  400  450  500  550  600  650  700  750  800  850  900  1100 
Mean  RT  (MSEC) 


N 

52 

46 

Mean 

483 

446 

Median 

453 

432 

Standard  Deviation 

102 

89 

Maximum 

876 

779 

Minimum 

304 

302 

Reliability 

.46 

52 

46 

47 

53 

45 

47 

32 

31 

123 

149 

-19 

-16 

-.00 

I N 

52 

46 

Mean 

544 

446 

Median 

498 

470 

Standard  Deviation 

145 

67 

Maximum 

1148 

766 

Minimum 

378 

331 

Reliability 

.40 

Table  10  (continued) 


Day  1 Day  2 


N 

52 

46 

Mean 

611 

637 

Median 

581 

569 

Standard  Deviation 

245 

216 

Maximum 

1833 

1582 

Minimum 

216 

444 

Reliability 

.68 

Day  1 

Day  2 

N 

52 

46 

Mean 

214 

140 

Median 

1 J n...!  .At 

218 

139 

e-p* 

wiuiiuulu  MwViuliwii 

MW 

ww 

Maximum 

530 

310 

Minimum 

57 

-3 

Reliability 

.32 

JUOLA  CATEGORY  SLOPE  (POSITIVE) 


mm 


ou  iuu  iau  iou  iou  zuu  zaj 

MSEC/I  tem 


JUOLA  CATEGORY  INTERCEPT  (POSITIVE) 


200  300  400  500  600  700  BOO  900  1500 

Mean  RT  (MSEC) 


JUOLA  CATEGORY  SLOPE  (NEGATIVE) 


0/1  I I 1 I I I I.  I .1  I I I 1 1 I.l  ITfcJ  i i tL-L/,1- 

Table 10 (continued)

[Histogram panels omitted. Each panel plots the Day 1 and Day 2 score
distributions for one variable and lists N, mean, median, standard
deviation, maximum, minimum, and reliability. Panels in this part of the
table: Juola category intercept (negative); Clark and Chase "below,"
"negation," "base," and "comparison"; Collins and Quillian slope (superset),
intercept (superset), and slope (property); Shepard proportion correct,
exponent (lag), intercept (lag), P ("hits"), P ("false alarms"), d', and β.
Horizontal axes are labeled Mean RT (MSEC), MSEC/Item, MSEC/Node,
Proportion Correct, Proportion Correct/Unit Lag, or Probability.]

COLLINS AND QUILLIAN INTERCEPT (SUPERSET)

                        Day 1    Day 2
Median                   1005     1004
Standard Deviation        205      220
Maximum                  1663     1786
Minimum                   544      650
Reliability               .69

SHEPARD D'

                        Day 1    Day 2
Mean                    1.284    1.335
Median                  1.317    1.238
Maximum                 2.169    2.960
Minimum                  .207     .614
Reliability               .62


that there is typically no presumption that the information-processing
operations  or  their  durations  are  independent  or  non-overlapping  in  time. 
Therefore,  the  analyses  that  were  conducted  in  the  present  research  can 
be  viewed  as  "speculative":  each  analysis  provides  some  information  that 
can  be  interpreted  as  bearing  on  construct  validity  issues  but  has  not  been 
agreed  upon  by  all  researchers  as  the  "correct"  procedure. 

The principal data used as inputs to the various analyses are the
observed intra- and inter-task correlations shown in Table 11, the observed
mean RTs for each of the variables, and a variable-by-operation matrix
derived from Table 7. This matrix consists of the variables listed along
one  axis,  the  operations  listed  along  the  other,  and  the  entries  of  "1"  or 
"0"  depending  upon  presence  or  absence  of  each  operation  in  the  composition 
of  each  variable.  Using  these  inputs,  three  major  analyses  were  conducted. 
The  first  was  a general  "model-fitting"  procedure,  using  the  correlations 
and  the  variable-by-operation  matrix;  the  second  was  a stepwise  regression 
analysis  where  the  operations  were  used  to  predict  the  observed  RTs;  and 
the  third  was  a multiple  regression  analysis,  where  estimates  were  obtained 
for  the  durations  of  the  variables.  Each  of  these  analyses  will  be  discussed 
below. 


Table 11

[Matrix of intra- and inter-task correlations among the variables; entries
omitted.]

Note. Pearson product-moment correlations rounded to the nearest hundredth;
decimals omitted. For df = 35, a value of r > .38 is significant at the .01
(one-tailed) level.


Model fitting. The notion of converging operations can be stated roughly
in terms of experimental design -- the idea is to include in the same
experiment some tasks that are hypothesized to involve a particular process
and some tasks that do not. The pattern of empirical correlations among the
tasks is then evaluated and inferences made about the validity of the
particular process. In the present case, however, it is in practice
impossible to interpret the empirical correlation matrices for several
reasons. First, in the 40-by-40 matrix, there are 780 correlations; while it
is conceptually possible to generate 780 hypotheses concerning the magnitude
and direction of these correlations, it is simply not an efficient strategy
to evaluate each hypothesis. Second, it is far from apparent what the
correlations between any two variables should be if (for example) they have
one operation in common and a second operation not in common. Thirdly, there
is no assumed independence of operations among tasks; that is, it is
entirely possible that all measures are positively intercorrelated or that
the correlations are mediated by higher-level "strategies." Therefore, an
alternate procedure was developed.

This alternate procedure involved the calculation of the theoretical
"distance" between each pair of variables in terms of the component
operations. The variable-by-operation matrix was examined and pair-wise
distances were calculated via the simple procedure of counting all
operations present in both variables and dividing that sum by the total
number of operations present in either task. (Other distance measures were
tried, including the calculation of the Euclidian distance between tasks and
a distance measure based on other assumptions about the zero-zero operation
match; since all results were approximately equivalent, and the one
described above is conceptually the easiest to understand, only this one
procedure will be discussed.) From these calculations, a theoretical
inter-variable distance matrix was constructed. Finally, the correlation
between the two matrices -- the empirical intercorrelations and the
theoretical distances -- was calculated. This correlation was obtained
separately for each day of testing and for two configurations of the
variable-by-operation matrix. The first configuration included the
operations of encoding, constructing, transforming, storing, retrieving
names, retrieving categories, searching STM, searching LTM, comparing, and
responding. The second configuration separated the various types of
encoding: encoding of letters, letter strings, phrases, numbers, words,
sentences, and pictures.
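
The pair-wise measure and the matrix-to-matrix correlation described above
can be sketched as follows. This is only a minimal illustration under stated
assumptions, not the program used in the study: the three variables, the
four operations, the 0/1 entries, and the "observed" correlations are all
hypothetical. Note that, as defined in the text, the measure grows with the
number of shared operations, so a positive correlation with the empirical
matrix indicates agreement.

```python
from itertools import combinations
from math import sqrt


def overlap(a, b):
    """Operations present in both variables divided by operations present in either."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical variable-by-operation matrix (1 = operation present).
# Columns: encoding, searching STM, comparing, responding.
variables = {
    "var_a": [1, 1, 1, 1],
    "var_b": [1, 0, 1, 1],
    "var_c": [1, 1, 0, 1],
}

# Hypothetical observed correlations for the same pairs of variables.
observed = {
    ("var_a", "var_b"): 0.50,
    ("var_a", "var_c"): 0.45,
    ("var_b", "var_c"): 0.20,
}

pairs = list(combinations(variables, 2))
theoretical = [overlap(variables[p], variables[q]) for p, q in pairs]
empirical = [observed[pair] for pair in pairs]

# Correlation between the theoretical and empirical matrices
# (off-diagonal entries only).
fit = pearson(theoretical, empirical)
print(theoretical, round(fit, 2))
```

With 40 variables the same loop would run over the 780 distinct pairs
mentioned earlier.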

The results were as follows: For the reduced operations configuration
(i.e., with a single encoding operation), a Pearson product-moment
correlation of r = .36 was obtained on Day 1 and r = .37 was obtained on
Day 2. For the complete set of operations (i.e., with the inclusion of seven
types of encodings) the obtained values were r = .22 and r = .24 on Day 1
and Day 2, respectively.


It is difficult to say whether or not these correlations represent
"good" or "bad" model fits. Certainly, the fact that non-zero correlations
were obtained can be interpreted positively, especially given the somewhat
arbitrary original selection of operations. Moreover, it is clear that
"improvements" of the fits could be accomplished if the variable-by-operation
matrix were modified iteratively. Nevertheless, we believe that this
fairly simple model-fitting procedure is a potentially valuable tool; in
addition, we are encouraged by the positive relationship between the
theoretical and empirical matrices.

Regression analyses. Another procedure that was used to "evaluate"
the validity of the hypothetical operations was to consider the operations
as predictors of the empirical measures in a regression paradigm.
Basically, each obtained measure was considered as being composed of one or
more operations that could be added together linearly to produce an observed
response time. Since there were 35 such measures (the two Shepard and
Teghtsoonian best-fit parameters were dropped, as were the three measures
noted at the bottom of Table 7, since they cannot be interpreted as latency
measures), there was a set of 35 simultaneous equations to solve. In
the general linear models procedure, the model fit can be evaluated directly
in terms of the obtained multiple R² (the proportion of variance accounted
for by the entire set of predictors); in addition, this procedure generates
parameter estimates as beta weights (since the predictor matrix contained
only ones and zeros) for each of the predictor variables. A stepwise
regression procedure would reveal the predictor variables that account
for the most variance.
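
The additive model described above can be illustrated with a small
least-squares sketch. It is an illustration under stated assumptions rather
than the analysis actually run: the three operations, the four measures, and
the response times are hypothetical, and the normal equations are solved
with a plain Gauss-Jordan routine instead of a statistical package.

```python
def solve(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular normal equations: no unique solution")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_additive_model(design, rts):
    """Least-squares operation durations and R-squared for a 0/1 design matrix."""
    k = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(row[i] * rt for row, rt in zip(design, rts)) for i in range(k)]
    durations = solve(xtx, xty)
    predicted = [sum(d * v for d, v in zip(durations, row)) for row in design]
    mean_rt = sum(rts) / len(rts)
    ss_res = sum((rt - p) ** 2 for rt, p in zip(rts, predicted))
    ss_tot = sum((rt - mean_rt) ** 2 for rt in rts)
    return durations, 1 - ss_res / ss_tot


# Hypothetical design: one row per measure, one column per operation
# (constructing, comparing, responding); 1 = operation present.
design = [
    [0, 0, 1],
    [0, 1, 1],
    [1, 0, 1],
    [1, 1, 1],
]
rts = [310, 440, 1005, 1145]  # hypothetical mean RTs in msec

durations, r_squared = fit_additive_model(design, rts)
print([round(d, 1) for d in durations], round(r_squared, 4))
```

In this toy design "responding" is present in every measure, so its column
plays the role of the intercept and the usual centered R² applies.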


Eight regression analyses were performed. Four of the eight were the
general linear models procedure (using the complete variable-by-operation
matrix and the reduced matrix for each of two days) and four were stepwise
regressions (of the same type). Considering the general procedure first,
the obtained R²s were as follows: For Day 1, R² = .72; for Day 2, R² = .70.
For the reduced operations matrix, these values were .71 and .70 for Day 1
and Day 2, respectively. All of these values were statistically significant
(p < .001, df = 34). Likewise, the final R²s of the stepwise regressions were
highly significant for all four analyses (the complete and reduced
operations for Days 1 and 2); the obtained R² = .66 (p < .001, df = 34).

The pattern of parameter estimates for the general linear models
procedure also supports the construct validity of the operations, in that the
magnitudes of these estimates are intuitively in line with expectations.
Some of these estimates are shown below. (It should be noted that an
infinite number of solutions to the normal equations exists. Hence, some
of the parameter estimates are biased in unknown directions. These biased
parameters are followed by the letter B. Numbers are in msec.)


                     Day 1     Day 2     Day 1       Day 2
                     (All)     (All)     (Reduced)   (Reduced)

Constructing          718       771       815         738
Transforming          269       184       311         209
Searching (LTM)       276B      260B      284B        179B
Searching (STM)       180       109        47          38
Comparing             135       127       196         161
Responding            341       259       401         289
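
The remark that the normal equations admit infinitely many solutions can be
illustrated with a toy case. The sketch below is hypothetical and does not
reproduce the study's predictor matrix: it simply assumes two operations
(here called encoding and responding) that occur in every measure, so that
only the sum of their durations is determined by the data.

```python
def predict(design, durations):
    """Predicted response time for each measure under the additive model."""
    return [sum(d * v for d, v in zip(durations, row)) for row in design]


# Hypothetical design; columns are encoding, comparing, responding.
# Encoding and responding appear in every measure, so their columns
# are identical and the design matrix is rank deficient.
design = [
    [1, 0, 1],
    [1, 1, 1],
    [1, 1, 1],
]

solution_a = [200, 150, 300]
solution_b = [100, 150, 400]  # 100 msec moved from encoding to responding

# Both sets of durations give exactly the same predictions, so no
# regression on these measures can choose between them.
same_predictions = predict(design, solution_a) == predict(design, solution_b)
print(predict(design, solution_a), same_predictions)
```

Any estimate reported for such a pair of operations depends on how the
fitting routine resolves the tie, which is one way estimates of this kind
can end up biased in an unknown direction.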

In the stepwise regression analyses, constructing and responding
accounted for significant components of the observed variance.

Conclusions. The results from several other analyses (as well as a
more extensive presentation of the above analyses) could have been
provided; however, the general pattern of results is consistent. The
theoretical operations hypothesized to determine task performance do, both
empirically and inductively, account satisfactorily for significant aspects
of performance. Naturally, the definitions of information-processing
constructs and the assignment of these constructs to variables should both be
iterative activities. Likewise, the analysis procedures should be examined
carefully and hopefully improved upon; it should be possible to develop some
standard construct validation procedures. However, to the extent that the
present experiment has shed light on some of these issues, further research
in the information-processing analysis of performance will benefit.


[I 
I 

I 

i 
I 

I 

I 

T 

ak 


fi 

0 

& 


*--ll 


REFERENCES


Baron, J. Phonemic stage not necessary for reading. Quarterly Journal
of Experimental Psychology, 1973, 25, 241-246.

Baron, J. & McKillop, B. J. Individual differences in speed of phonemic
analysis, visual analysis, and reading. Acta Psychologica, 1975, 39,
91-96.

Carroll, J. B. Psychometric tests as cognitive tasks: A new "structure
of intellect." Princeton, New Jersey: Educational Testing Service,
May 1974.

Clark, H. H. & Chase, W. G. On the process of comparing sentences against
pictures. Cognitive Psychology, 1972, 3, 472-517.

Collins, A. M. & Quillian, M. R. Retrieval time from semantic memory.
Journal of Verbal Learning and Verbal Behavior, 1969, 8, 240-247.

Juola, J. F. & Atkinson, R. C. Memory scanning for words versus categories.
Journal of Verbal Learning and Verbal Behavior, 1971, 10, 522-527.

Juola, J. F. & McDermott, D. A. Memory search for lexical and semantic
information. Journal of Verbal Learning and Verbal Behavior, 1976,
15, 567-575.

Melton, A. W. Individual differences and theoretical process variables:
General comments on the conference. In R. M. Gagné (ed.) Learning and
individual differences. Columbus, Ohio: Charles E. Merrill, 1967.

Neisser, U. New York: Appleton-Century-Crofts, 1969.

Posner, M. I. Abstraction and the process of recognition. In G. H. Bower &
J. T. Spence (eds.) The psychology of learning and motivation: Advances
in research and theory. New York: Academic Press, 1969.

Posner, M. I. & Mitchell, R. F. Chronometric analysis of classification.
Psychological Review, 1967, 74, 392-409.

Rose, A. M. Human information processing: An assessment and research battery.
Human Performance Center Technical Report No. 46, Ann Arbor, Michigan, 1974.

Rose, A. M. & Fernandes, K. An information-processing approach to personnel
management. Paper presented at the Office of Naval Research Contractors
Conference on Job Analysis, Job Design, and Employment Criteria. December
1976.

Rubenstein, H., Garfield, L. & Millikan, J. A. Homographic entries in
the internal lexicon. Journal of Verbal Learning and Verbal Behavior,
1970, 9, 487-494.

Shepard, R. N. & Teghtsoonian, M. Retention of information under
conditions approaching a steady state. Journal of Experimental Psychology,
1961, 62, 302-309.

Sternberg, S. Two operations in character recognition: Some evidence
from reaction-time measurements. Perception & Psychophysics, 1967,
2, 45-53.

Sternberg, S. The discovery of processing stages: Extensions of Donders'
method. Acta Psychologica, 1969, 30, 276-315.

Sternberg, S. Memory scanning: New findings and current controversies.
Quarterly Journal of Experimental Psychology, 1975, 27, 1-32.



Introduction 

"The  research  you  are  about  to  take  part  in  is  one  phase  of  a larger 
project  designed  to  help  understand  basic  human  information  processing 
capacities  and  limitations. 

The results of this project will be used to improve educational and
vocational guidance programs. The project will, for example, contribute
to the matching of individual qualifications and characteristics as needed
for specific jobs and to the development of training programs for various
occupations and professions.

Your  participation  in  this  project  will  require  attendance  at  two 
sessions  held  on  consecutive  days,  and  consisting  of  approximately  two 
hours  each  session. 

During each session you will be asked to complete a series of simple
tasks.  These  tasks  involve  simple  matching  and  memory  decisions,  and  do 
not  test  your  knowledge  of  general  information,  your  intelligence  or  your 
personality. 

You  should  be  able  to  complete  the  tasks  within  2 hours.  At  the  end 
of  the  second  session  you  will  be  paid  $20  for  your  participation.  Are 
there  any  questions? 

Read  through  the  following  description  of  the  first  task,  and  ask  any 
questions  you  may  have." 


MEYER 

"In  this  first  task,  you  will  see  strings  of  letters.  Your  job  will 
be  to  decide  whether  or  not  the  letter  strings  spell  a common  English 
word. This is not a test of how many words you know; rather, we are
measuring how quickly and accurately you can decide whether a string of
letters is a word.

At  the  start  of  each  trial,  you  will  see  two  dots  on  the  screen,  one 
above  the  other.  The  appearance  of  these  two  dots  will  alert  you  that  a 
letter  string  is  about  to  appear.  After  a short  delay,  the  top  point  will 
be replaced by a string of letters. You will decide whether or not it is
a word.  If  it  is,  press  the  key  on  your  left,  as  quickly  as  you  can.  If 
the  letter  string  does  not  spell  a word,  press  the  key  on  your  right,  as 
quickly  as  you  can.  After  you  make  your  response,  there  will  be  a short 
delay  and  then  the  lower  dot  will  be  replaced  by  a second  letter  string. 
You  will  again  judge  whether  or  not  this  string  is  a word.  If  it  is,  you 
will  again  press  the  left  key;  if  it  is  not,  press  the  right  key. 




Each  trial  will  consist  of  two  letter  strings  and  two  responses. 
After  each  trial,  the  screen  will  provide  information  as  to  whether  or 
not your responses were correct and how rapidly you made a correct
response.

The first block of trials will be for practice. This will be followed
by four more blocks, with a short rest between each.

Let  us  emphasize  again  that  the  English  words  used  in  this  task  are 
common,  familiar  words.  Work  as  quickly  and  as  accurately  as  you  can. 
Any  questions?  If  you  are  ready  to  begin,  please  press  the  button  on 
your  right." 


CLARK and CHASE

"In this second task, we will be measuring how quickly and accurately
you can determine whether a sentence is true or false. You will see a
number of short sentences, each paired with a picture. The sentences
will be on the left and the picture on the right. The picture shows either
a star directly above a plus (* over +) or a plus directly above a star
(+ over *). The sentences claim to describe the pictures. For example, the
sentences "STAR IS ABOVE PLUS." and "STAR ISN'T BELOW PLUS." are both true
descriptions of the picture with the star above the plus (* over +).

Your  job  is  to  read  each  sentence  and  to  decide  whether  it  is  a true 
or  a false  description  of  the  picture.  If  you  think  that  the  sentence 
describes  the  picture  correctly,  press  the  left  hand  key.  If  you  think 
that  the  sentence  does  not  give  a true  description  of  the  picture,  press 
the  right  hand  key.  You  are  to  work  as  quickly  as  you  can  without  making 
mistakes. 

Each  trial  will  begin  with  the  appearance  of  a dot  on  the  left  side 
of  the  screen.  Following  a short  delay,  the  dot  will  disappear  and  the 
sentence  and  picture  will  be  shown.  After  you  make  your  response,  the 
screen  will  display  the  word  "CORRECT"  and  your  response  time  if  you  were 
correct  or  the  word  "WRONG"  if  you  were  incorrect. 

The first list you will see is a practice list. This will be followed
by five blocks of sentences with a short rest between blocks. Any
questions?"


STERNBERG 

"This  third  task  measures  how  quickly  and  accurately  you  can  recognize 
items  that  you  have  just  seen.  On  each  trial,  you  will  see  one  or  more 
numbers on the left side of the screen. You will be given a few seconds
to memorize the numbers, after which they will disappear. Following this
study  period,  a single  number  will  appear  on  the  right  side  of  the  screen. 
You  will  decide  whether  or  not  this  number  was  one  of  the  numbers  you  have 
just  memorized. 

If  the  test  number  was  present  in  the  memory  list,  press  the  left  key. 

If  the  test  number  was  not  part  of  the  memory  list,  press  the  right  key. 

After  each  response,  the  screen  will  provide  information  as  to  whether 
or  not  your  response  was  correct  and  how  rapidly  you  made  a correct  response. 
The  first  block  of  trials  will  be  for  practice. 

The  practice  block  will  be  followed  by  five  test  blocks,  with  a short 
rest  between  each  block.  Please  work  as  quickly  as  you  can  without  making 
any  errors.  Any  questions?" 


COLLINS AND QUILLIAN

"This fourth task measures how quickly and accurately you can decide
if  a sentence  is  true  or  false.  Each  sentence  is  a factual  statement 
like  ZEBRAS  HAVE  STRIPES.  If  a sentence  is  generally  true  call  it  true 
and  if  it  is  generally  false  call  it  false.  Do  not  waste  time  thinking 
of  exceptions  to  generally  true  or  generally  false  sentences;  you  will 
easily  be  able  to  tell  the  difference  between  the  two. 

Occasionally  you  will  see  a redundant  sentence,  such  as  A BOAT  IS  A 
BOAT.  These  sentences  serve  as  a comparison  for  the  other  sentences. 

Treat  them  as  true. 

Each  trial  begins  with  a dot  on  the  screen  to  warn  you  that  a sentence 
is  about  to  appear.  When  the  sentence  appears,  respond  by  pressing  the 
left  key  if  the  sentence  is  generally  true;  if  the  sentence  is  generally 
false,  press  the  right  key. 

The  first  set  of  sentences  will  be  for  practice.  This  practice 
block  will  be  followed  by  two  test  blocks  with  a short  break  between 
them.  Following  each  response  there  will  be  a short  delay,  during  which 
response time and error information will appear on the screen. Any
questions?"


POSNER 


Physical 

"In  this  fifth  task,  you  will  be  making  a series  of  simple  judgments. 
In  the  first  condition  we  are  concerned  with  how  accurately  and  rapidly 
you  can  decide  whether  two  displays  are  physically  the  same  or  different. 
You will see pairs of letters, one on the left side of the screen and one
on the right side. Your job is to judge whether the two letters are
physically the same or if they are different. For example, the letter pair AA
would be judged as the same, while the pair AB would be judged as different.
If they are the same, press the left key; if they are different, press
the right key.

After  you  respond,  the  screen  will  display  how  fast  you  responded  and 
whether  or  not  you  were  correct.  Please  respond  as  rapidly  as  you  can, 
without  making  errors. 

Before  the  letters  appear,  two  dots  will  be  displayed.  These  dots 
will  be  your  warning  signal  - a few  seconds  after  the  dots  appear,  the 
letters  will  come  on.  The  first  block  of  trials  is  for  practice.  Any 
questions?" 


"In  this  second  condition  you  will  again  see  pairs  of  letters.  This 
time,  you  are  to  decide  whether  or  not  the  letters  have  the  same  name. 

For  example,  the  letter  pair  Aa  would  be  judged  as  the  same,  while  the 
pair  Ab  would  be  judged  as  different.  If  they  are  the  same,  respond  by 
pressing  the  left  key;  if  they  have  different  names,  press  the  right  key. 
Work  as  quickly  as  you  can  without  making  any  errors.  The  first  block  of 
trials  is  for  practice.  Any  questions?" 


"In  this  third  condition,  you  will  again  see  pairs  of  letters.  This 
time,  you  are  to  decide  whether  or  not  the  two  letters  belong  to  the  same 
category. If the two letters are both vowels or both consonants, respond
by  pressing  the  left  key.  For  example,  the  letters  AO  are  both  vowels 
and  should  be  judged  as  same;  the  letters  TN  are  both  consonants  and  would 
also  be  judged  as  same.  If  these  letters  are  different  - one  is  a vowel 
and  the  other  is  a consonant  - press  the  right  key.  For  example,  the  letter 
pair  AB  would  be  judged  as  different.  Work  as  quickly  as  you  can  without 
making any errors. The first block of trials is for practice. Any questions?"


BARON 


Sense-Nonsense


nonsense phrases; the difference between sense and nonsense will be
obvious.


At  the  start  of  each  trial,  you  will  see  a dot  on  the  left  side  of  the 
screen.  The  appearance  of  the  dot  will  alert  you  that  a phrase  is  about  to 
appear.  After  a short  delay,  the  dot  will  be  replaced  by  the  phrase.  You 
will  decide  whether  or  not  it  makes  sense.  If  it  does,  press  the  key  on 
your  left.  If  it  does  not,  press  the  key  on  your  right. 

After each trial, the screen will display your response speed and
accuracy. The first few trials will be for practice. These practice trials
will  be  followed  by  two  test  blocks  with  a short  rest  between  blocks.  Any 
questions?"


"In this second condition, you will again see a series of phrases and
you  will  again  decide  whether  or  not  the  phrases  make  sense.  However,  in 
this  condition,  some  of  the  phrases  have  been  constructed  so  that  if  you 
say  them,  they  would  sound  as  if  they  made  sense.  Therefore,  you  must  be 
careful  to  judge  the  phrases  as  sense  or  nonsense  on  the  basis  of  how  they 
look  and  not  on  the  basis  of  how  they  sound.  As  before,  if  a phrase  makes 
sense,  respond  by  pressing  the  left  key;  if  it  does  not  make  sense,  press 
the  right  key.  The  first  few  trials  will  be  for  practice.  These  practice 
trials  will  be  followed  by  two  test  blocks.  Again,  work  as  quickly  and  as 
accurately  as  you  can.  Any  questions?" 


Nonsense  Homophone 


"In  the  next  condition,  you  will  again  see  a series  of  phrases  and  you 
will  again  decide  whether  or  not  the  phrases  make  sense.  However,  in  this 
condition  all  phrases  will  look  like  nonsense.  You  are  to  judge  whether 
or  not  the  phrases  make  sense  by  their  sound.  If  a phrase  sounds  like  it 
makes sense, press the left key; if it sounds like nonsense, press the right
key. 


Again,  work  as  quickly  and  as  accurately  as  you  can.  The  first  few 
trials will be for practice, followed by two test blocks. Any questions?"


JUOLA 


Words


"The seventh task consists of two parts. The first part measures how
quickly  and  accurately  you  can  recognize  items  that  you  have  just  seen. 

On  each  trial,  you  will  see  from  one  to  four  words  on  the  left  side 
of the screen. You will be given a few seconds to memorize the words.


After  this  study  period,  a single  word  will  appear  on  the  right  side 
of  the  screen.  You  will  decide  whether  or  not  this  word  was  one  of  the 
words  you  have  just  memorized. 

If  the  test  word  was  present  in  the  memory  list,  press  the  left  key. 

If  the  test  word  was  not  part  of  the  memory  list,  press  the  right  key. 

After  each  response,  the  screen  will  provide  information  as  to  whether 
or  not  your  response  was  correct  and  how  rapidly  you  made  a correct  response. 

The  first  block  of  trials  will  be  for  practice.  The  practice  block  will 
be  followed  by  two  test  blocks,  with  a short  rest  between  each  block.  Please 
work  as  quickly  as  you  can  without  making  any  errors.  Any  questions?" 


"As  you  may  have  noticed,  each  of  the  words  used  in  the  first  part  of 
this task was the name of a category. The second part of this task is
similar to the first in that you will again be shown from one to four category
words  on  the  left  side  of  the  screen  for  you  to  memorize. 

Again,  a single  test  word  will  appear  on  the  right  side  of  the  screen. 
However, this time you will decide whether or not the test word is an
example of one of the categories in the list you have just memorized. For
instance, suppose that you see the words COLOR and RELATIVE on the left side
for  you  to  memorize.  The  test  word  might  be  BLUE.  Since  this  is  an  example 
of  the  category  COLOR,  you  would  respond  by  pressing  the  left  key.  If  the 
test  word  had  been  HORSE,  you  would  respond  by  pressing  the  right  key  since 
HORSE  is  not  an  example  of  either  COLOR  or  RELATIVE. 

As  before,  the  screen  will  provide  information  concerning  your  response 
speed  and  accuracy  after  each  trial.  The  first  block  of  trials  will  be  for 
practice.  This  practice  block  will  be  followed  by  two  test  blocks  with  a 
short  rest  between  them.  Please  work  as  quickly  as  you  can  without  making 
any  errors.  Any  questions?" 


SHEPARD 

"This  eighth  task  is  a little  different  from  the  rest  of  the  tasks. 
This is a test of how well you can recognize numbers that you have
previously seen.

You will be shown a series of three-digit numbers. For each three-digit
number you must decide whether or not you have seen that number
before in the series. If you have seen it, respond by pressing the left
key. If you have not seen that number before, press the right key.

To illustrate, look at the following sequence of numbers and the
correct responses to each:"


Number    Response

100       New (right key)
200       New (right key)
100       Old (left key)
300       New (right key)
200       Old (left key)

"The first number is, of course, always "New". This task does not
measure how quickly you respond; however, there is a 10-second time limit for
each  response.  If  you  take  longer  than  10  seconds,  a "TOO  SLOW"  message 
will  appear  on  the  screen  and  the  next  number  in  the  list  will  be  shown. 
Therefore,  you  will  have  to  work  fairly  rapidly.  Within  these  constraints, 
try  to  work  as  accurately  as  you  can. 

In  order  to  minimize  confusion,  there  will  not  be  any  practice  for  this 
task.  Therefore,  if  you  have  any  questions,  ask  them  now." 
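
The old/new rule illustrated in the Shepard instructions can be sketched in
a few lines. This is an illustrative sketch, not the scoring program used in
the study; the function name is hypothetical, and "lag" is taken here to
mean the number of items that intervened since the number was last shown,
which is an assumption of the example.

```python
def score_sequence(numbers):
    """Return (response, key, lag) for each number in presentation order.

    A number is "Old" if it appeared earlier in the series, otherwise "New".
    For old items, lag counts the items shown since its previous appearance.
    """
    last_position = {}
    results = []
    for position, number in enumerate(numbers):
        if number in last_position:
            lag = position - last_position[number] - 1
            results.append(("Old", "left key", lag))
        else:
            results.append(("New", "right key", None))
        last_position[number] = position
    return results


# The example sequence given in the instructions.
sequence = [100, 200, 100, 300, 200]
scored = score_sequence(sequence)
responses = [response for response, _key, _lag in scored]
lags = [lag for _response, _key, lag in scored]

for number, (response, key, _lag) in zip(sequence, scored):
    print(number, response, f"({key})")
```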




